Data contents

The inpatient dataset contains the following variables:

Variable Description
DRG Definition The code and description identifying the MS-DRG. MS-DRGs are a classification system that groups similar clinical conditions (diagnoses) and the procedures furnished by the hospital during the stay.
Provider Id The CMS Certification Number (CCN) assigned to the Medicare certified hospital facility.
Provider Name The name of the provider.
Provider Street Address The provider’s street address.
Provider City The city where the provider is located.
Provider State The state where the provider is located.
Provider Zip Code The provider’s zip code.
Provider HRR The Hospital Referral Region (HRR) where the provider is located.
Total Discharges The number of discharges billed by the provider for inpatient hospital services.
Average Covered Charges The provider’s average charge for services covered by Medicare for all discharges in the MS-DRG. These will vary from hospital to hospital because of differences in hospital charge structures.
Average Total Payments The average total payments to all providers for the MS-DRG including the MS-DRG amount, teaching, disproportionate share, capital, and outlier payments for all cases. Also included in average total payments are co-payment and deductible amounts that the patient is responsible for and any additional payments by third parties for coordination of benefits.
Average Medicare Payments The average amount that Medicare pays to the provider for Medicare’s share of the MS-DRG. Average Medicare payment amounts include the MS-DRG amount, teaching, disproportionate share, capital, and outlier payments for all cases. Medicare payments DO NOT include beneficiary co-payments and deductible amounts nor any additional payments from third parties for coordination of benefits.

Data exploration (continued)

In the prior classes, we cleaned and tidied the dataset and then started the exploration process. The following code comes from the demos we worked through in class and are what we used to clean up the dataset:

inpatient_tidy <- inpatient %>%
  separate(col = `Hospital Referral Region (HRR) Description`,
           into = c("State2", "Provider_HRR"),
           sep = " - ", remove = TRUE) %>%
  mutate(Provider_HRR = str_to_upper(Provider_HRR)) %>%
  select(-State2)

inpatient_tidy2 <- inpatient_tidy %>%
  separate(col = `DRG Definition`, into = c("DRG_Code", "Diagnosis"),
           sep = " - ", remove = FALSE)

In addition, we computed the ratio \(\dfrac{\text{Average Medicare Payments}}{\text{Average Covered Charges}}\) to compare the amount a hospital billed for a procedure versus how much Medicare actually paid as reimbursement. The code was as follows:

inpatient_with_reimbursement <- inpatient_tidy2 %>%
  mutate(reimbursement_fraction = `Average Medicare Payments` / `Average Covered Charges`)

Please check the code in your RMarkdown file and verify that you have the above code snippets. You will need the dataset assigned to inpatient_with_reimbursement for the following exercises.

Question: Do all hospitals across the country charge Medicare approximately the same amount for joint replacements? Does Medicare reimburse uniformly for the same procedure?

This dataset lets us explore the question of whether billing and Medicare reimbursement for a specific procedure is uniform across the country, or if there are variations that may be due to other factors, such as hospital location. Let’s focus in on one of the more expensive procedures, joint replacements. Joint replacements coded as a secondary diagnosis that is a major complication or comorbidity (MCC) are labeled under Diagnosis as MAJOR JOINT REPLACEMENT OR REATTACHMENT OF LOWER EXTREMITY W MCC.

Exercise 1

Filter down the dataset so that only rows with MAJOR JOINT REPLACEMENT OR REATTACHMENT OF LOWER EXTREMITY W MCC under the Diagnosis column remain. Write one line of code that does this and assign this dataset to the variable inpatient_joint_replacement.

Exercise 2

Calculate the average reimbursement_fraction in each state. Then, determine the state with the highest reimbursement_fraction average and the state with the lowest reimbursement_fraction average.

Hint: Group the dataset by states, then figure out how to use the mean() command with the summarize() to compute this quantity.

Exercise 3

After identifying the state with the lowest reimbursement_fraction average (let’s call this state_lowest_fraction), go back to inpatient_joint_replacement and filter the data again. This time, we want a dataset that only has rows corresponding to a joint replacement diagnosis procedure in the state state_lowest_fraction. Run the filter and assign the result to a variable.

Extra exercises

If you complete the above, then continue on with the following to give yourself additional practice.

  1. In the filtered dataset from exercise 3, group by Provider_HRR and investigate the values of Average Covered Charges and Average Medicare Payments within each HRR region. Some ways to do this include running a summary report using max() and also min() on these columns, which shows the range of billed prices and reimbursed amounts. You can also compute the averages for each region. Does there seem to be any uniformity to the price range from region to region? How about within a region?

  2. Choose another diagnosis and repeat the analysis in exercises 2 and 3. Sort the average reimbursement_fraction per state from high to low and compare with your result in exercise 2. For the two different diagnoses, does the sort order of reimbursement_fraction follow a similar pattern or is it completely different?

Clean-up

Don’t forget to save, stage, commit, and push your code to Github before leaving class! You have until class (3:00pm) on Wednesday, October 11 to complete exercises 1 through 3 and push to Github.