12 Comments

  1. ecarlson

    Question came in:
    Our team had a couple of additional questions we’d like to ask:
    1) There were some instances where the same diagnosis description has different diagnosis codes. We noticed these occurrences when counting the number of times a diagnosis code was associated with incidences of presumptive or definitive procedure codes (7 codes listed by MITRE as problematic). Two of the 10 descriptions, “COUMADIN THERAPY” and “DRUG AND ALCOHOL ADDICTION”, are each mapped to more than one diagnosis code. The team needs to know whether we should assess these incidences as errant or fraudulent.
    2) Likewise, there is one instance where a unique diagnosis code references 2 different descriptions. Diagnosis code 12.131 has 2 descriptions which are ‘Coumadin Therapy’ and ‘Dehydration’. Of note, ‘Coumadin Therapy’ as a diagnosis description has 2 different diagnosis codes. From a data perspective, this is problematic; however, the question is whether or not this suspicious data represents fraud. It will be helpful if we can get a better understanding of whether we need to investigate these further. If they are errors, then we will begin to search elsewhere.

    • ecarlson

      1) In this dataset you may see different diagnosis codes with the same or similar descriptions. Diagnosis descriptions in the data are the short descriptions; the language that distinguishes the codes may be found only in the long definition. In this dataset this should be considered a normal finding and should not be considered in your evaluation of the data looking for patterns of fraud, waste or abuse.

      2) Real-life data would not show the same diagnosis code with different descriptions. If or when this is seen in this data, it should be considered a flaw of the synthetic dataset and should not be considered in your evaluation of the data looking for patterns of fraud, waste or abuse.
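
      The two checks discussed above (one description mapped to several codes, and one code mapped to several descriptions) can be sketched as follows. This is a minimal illustration only: the column names `diag_code` and `diag_desc` and the codes `99.001` and `55.500` are assumptions made up for the example; only code 12.131 and its two descriptions come from the question.

```python
from collections import defaultdict


def find_ambiguous_mappings(rows, key_field, value_field):
    """Return {key: sorted distinct values} for keys mapped to more than one value."""
    seen = defaultdict(set)
    for row in rows:
        # Normalize case and whitespace so "Coumadin Therapy" and
        # "COUMADIN THERAPY" count as the same description.
        key = row[key_field].strip().upper()
        value = row[value_field].strip().upper()
        seen[key].add(value)
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}


# Hypothetical sample rows; the column names and the codes 99.001 and 55.500
# are invented for illustration and are not taken from the dataset.
claims = [
    {"diag_code": "12.131", "diag_desc": "Coumadin Therapy"},
    {"diag_code": "12.131", "diag_desc": "Dehydration"},
    {"diag_code": "99.001", "diag_desc": "COUMADIN THERAPY"},
    {"diag_code": "55.500", "diag_desc": "Drug and Alcohol Addiction"},
]

# Case 1: the same description carries more than one code (a normal finding here).
codes_per_desc = find_ambiguous_mappings(claims, "diag_desc", "diag_code")
# Case 2: the same code carries more than one description (a synthetic-data flaw).
descs_per_code = find_ambiguous_mappings(claims, "diag_code", "diag_desc")

print(codes_per_desc)
print(descs_per_code)
```

      Per the answer above, neither result should be scored as fraud, waste or abuse; a listing like this is mainly useful for setting those records aside before looking for other patterns.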

  2. ecarlson

    Good try – but pencils down as we have a winner of our Flash Challenge – Team Rutgers EMBAs!
    At 11:29 a.m., Team Rutgers EMBAs responded, identifying these two providers, located in Yellowstone National Park (MT) and at the Shark Valley Visitor Center of Everglades National Park (FL):

    prov_id    | prov_name                     | prov_address       | prov_city        | prov_zip | prov_state | n_records | n_patients | sum_paid
    PROV016981 | RODNEY21 KOCH169              | 30 YELLOWSTONE AVE | WEST YELLOWSTONE | 59758    | MT         | 344       | 75         | $279,133.25
    PROV059201 | MARIA DEL CARMEN27 MERCADO213 | 36000 SW 8th St    | MIAMI            | 33194    | FL         | 733       | 272        | $415,959.41

  3. ecarlson

    In the spirit of holiday shopping season we are offering a flash challenge – the members of the team that solves the challenge will each win a $50 Amazon gift card.

    Here is the challenge:

    Two providers’ addresses are actually visitor centers in state or national parks. Tip: these providers may not be related to ordering/using drug screen tests.
    Find the two providers.

    The first team to identify the two providers and submit its answers to hcfchallenge@mitre.org wins. Good luck!

    • ecarlson

      Clarification: Address includes Prov_Address, Prov_City, Prov_State and Prov_Zip
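
      One way to work with the combined address is to join the four fields into a single normalized string and compare it against a reference list of park visitor-center addresses. The sketch below is an assumption about how this could be approached, not the method any team used. The lowercase column names follow the results table posted in this thread, the third provider row is invented, and `reference_addresses` is a hypothetical list that would have to be compiled from an outside source (here it holds only the two addresses already announced above).

```python
import re


def normalize(text):
    """Uppercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^A-Z0-9 ]", " ", str(text).upper())
    return re.sub(r"\s+", " ", text).strip()


def full_address(record):
    """Join address, city, state and zip into one normalized string."""
    fields = ("prov_address", "prov_city", "prov_state", "prov_zip")
    return normalize(" ".join(str(record[f]) for f in fields))


# Two rows from the announced results plus one invented row (PROV000001).
providers = [
    {"prov_id": "PROV016981", "prov_address": "30 YELLOWSTONE AVE",
     "prov_city": "WEST YELLOWSTONE", "prov_state": "MT", "prov_zip": "59758"},
    {"prov_id": "PROV059201", "prov_address": "36000 SW 8th St",
     "prov_city": "MIAMI", "prov_state": "FL", "prov_zip": "33194"},
    {"prov_id": "PROV000001", "prov_address": "1 Main St.",
     "prov_city": "Springfield", "prov_state": "IL", "prov_zip": "62701"},
]

# Hypothetical reference list of visitor-center addresses (assumption).
reference_addresses = {
    normalize("30 Yellowstone Ave, West Yellowstone, MT 59758"),
    normalize("36000 SW 8th St, Miami, FL 33194"),
}

matches = [p["prov_id"] for p in providers
           if full_address(p) in reference_addresses]
print(matches)
```

      Exact matching after normalization is deliberately strict; abbreviations such as "St" versus "Street" would need extra handling before the comparison.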

  4. ecarlson

    In the spirit of the holiday season we are offering you a chance to win a flash challenge! This afternoon, December 5, at 5:00 p.m. ET, team captains will be sent a question to solve using the data. The first team that successfully finds both answers and emails them to the Competition mailbox wins, and the winning team members will each receive a $50 Amazon gift card! If your team captain is unavailable at 5 today, email a proxy.

  5. Tongtong Huang

    Are there any specific rules to generate the diagnosis codes?

    • ecarlson

      For the UDS records we used the ICD-10 description (not the code), but some of the other synthetically created data may use other descriptions that our team did not develop or have insight into.

  6. Rutgers EMBAs

    Does the dataset have CPT codes? How do we know which code corresponds to the billing rate?

    • The MITRE Competition Team

      There are no CPT codes in this dataset: the CPT code set is copyrighted and maintained by the AMA, and we were unable to obtain permission to use the codes for this competition. The equivalent is the procedure code column. CPT and HCPCS codes are usually used together on health care claims, and both are procedure codes that document the types of medical services provided to patients. We created a list of procedure codes based on the descriptions of the medical services; it includes both medical services and urine drug tests. Please use the procedure description column to understand the services for a patient on a particular claim line, and treat the procedure code as the code for that description.

  7. The MITRE Challenge Team

    Office Hours Tomorrow at 4 p.m. ET! Join Amber, Song and Alanna who will be available to answer your questions about the data, UDS fraud, or about the competition in general. Dial in is: +1 (781) 271-2020,,3292673# ##

  8. drozdetski@mitre.org

    Have questions about the Competition? Please let us know!
