Health care providers

Towards Quantitative Medicine

The application of data mining techniques for knowledge discovery holds great promise. Data mining algorithms have been applied successfully in many areas, including retail and telecommunications. Applying these methods in the medical domain is challenging, however, because the data sets are often very large and complex, with numerous sparsely populated variables such as diagnoses, procedures and drugs. These variables are sparse because most people are healthy (and therefore have few diagnoses, procedures and drugs), and those who are not suffer from a wide variety of conditions (so relatively few people share the same set of conditions).

The data used in this assignment is based on claims data. Claims data is generated when hospitals, pharmacies and other health care providers send claims to third-party payers to receive reimbursement for their services. The claims data includes information on diagnoses, procedures and drug prescriptions, as well as place of service and the patient’s age, gender and ZIP code. A collection of members’ claims has the benefit of giving a “bird’s eye view” of patients’ health care.

Diabetes is a chronic disease in which the body does not produce or properly use insulin, and it is believed that over 18 million Americans suffer from it. Effective disease management for this group and a better understanding of the disease are therefore of great importance.

The data (members.xlsx) used in this study has information on over 17,000 diabetic patients. We have data on their diagnoses, procedures and drugs over a 12-month observation period (as well as some other details from the observation period), together with their overall health care costs in the 12 months following the observation period, called the result period. We will apply Association Rules to gain insights into diabetes, risky health care patterns and exploratory data mining in general.
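The questions below rely on three standard measures for a rule "antecedent -> consequent": support (the number of members who have every item in the rule), confidence (the share of members with the antecedent who also have the consequent) and lift (confidence divided by the overall share of members with the consequent). The following is a minimal Python sketch of those definitions, not the XLMiner implementation. The toy member list and the codes "A", "B" and "C" are hypothetical placeholders and do not come from members.xlsx.

```python
# Hypothetical toy data: each member is the set of diagnosis codes recorded
# for them. The codes "A", "B", "C" are placeholders, not real columns.
members = [
    {"A", "B"}, {"A", "B"}, {"A"}, set(),
    set(), {"A", "B", "C"}, {"C"}, {"B"},
]


def rule_metrics(transactions, antecedent, consequent):
    """Return (support count, confidence, lift) for antecedent -> consequent.

    Assumes the antecedent and the consequent each occur at least once,
    otherwise the divisions below are undefined.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Number of members whose record contains every item of each side.
    n_antecedent = sum(antecedent <= t for t in transactions)
    n_consequent = sum(consequent <= t for t in transactions)
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n)
    return n_both, confidence, lift


support, confidence, lift = rule_metrics(members, {"A"}, {"B"})
print(support, confidence, lift)  # prints: 3 0.75 1.5
```

A lift above 1 means the consequent is more common among members with the antecedent than in the population as a whole; a lift close to 1 means the antecedent tells us little.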

Association Rule Analysis

1. For each of the diagnosis columns, count the number of members with the diagnosis.

a. Provide a readable table (translate the diagnosis codes into actual conditions) as an exhibit of the top 10 diagnoses and their counts.

2. Run the association rule algorithm in XLMiner, on diagnosis information only (columns Diag_DD0002 through Diag_DD0266 – leave out Diag_DD0000), with all the default parameter settings.

a. What is the average confidence of the rules created?

b. What is the average lift of the rules created?

c. Briefly explain the reasons behind the average confidence and the average lift.

3. One of the difficulties in creating good rules from medical data is that most diseases are rare; setting the minimum support too high will therefore produce uninteresting rules, if any at all. Run the association rule algorithm again with the default confidence setting, but change the minimum support setting, first to 174 and then to 17.

a. How many rules were created when the minimum support is 174?

b. How many rules were created when the minimum support is 17?

4. Analyze the three rules that have the maximum lift (you may need to order your rules) when the minimum support is set to 174. Provide a brief interpretation of these three rules and explain why all of them have the same support, confidence and lift.

5. Delete Diag_DD0046 from the data (or otherwise exclude it from the analysis). Rerun the Association Rules algorithm with a) default settings, b) minimum support set to 174, c) minimum support set to 17.

a. How many rules were created when the minimum support is 174?

b. How many rules were created when the minimum support is 17?

6. Out of the latter two runs, select the rule with the highest lift ratio. Explain the rule in words.

a. Hypothetically, if you were a doctor and a diabetic patient walked in already diagnosed with the rule's antecedent diagnosis, how could the rule potentially guide your work (with all the simplifying assumptions needed to answer this question)?

7. One of the main reasons behind the interest in data mining in health care is the hope that, through intervention and prevention, one can help reduce health care costs by identifying early those patients who are at risk of high health care costs. To contribute to this goal, it is not enough to run association rules on the whole data set, as nothing distinguishes costly patients from non-costly patients. To use association rules to distinguish high-risk patients from low-risk patients, we need to identify rules that have good support and high confidence in the high-cost population, but low support in the not-high-cost population. In particular, we are interested in identifying rules of the type:

group of diagnosis codes -> High cost in a future period

The variable TA2 contains the overall cost in the year following the observation period. Define a new variable that equals one if the overall cost is ≥ $40,000, and zero otherwise.

Run the Association Rule algorithm again using all diagnosis variables (continue to exclude the diabetes diagnosis and Diag_DD0000) and the new variable. Set the support as low as possible (but no smaller than 15) and keep the confidence at the default setting. Sort the resulting rules by “consequent (c)”, and identify rules of the form above.

a. What was your support setting?

b. How many rules did you identify?

8. Create an exhibit that summarizes the rules (translate the diagnosis codes into actual conditions), their support, confidence and lift.

a. Briefly discuss the main characteristics of the rules.

9. Renal Failure is in general a very expensive condition to treat, as either the patient needs to be treated with dialysis or undergo a transplant, which is a costly surgery. It is therefore not surprising that some of the rules above include Renal Failure.

a. In the population of just over 17,000, how many suffer from renal failure?

b. Out of those that suffer from renal failure, how many have high costs following the observation period?

c. Based on the above answers, why did the Association Rule Analysis not give us the rule:

Renal Failure -> High Cost?
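The high-cost flag from question 7 and the check behind question 9 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the XLMiner procedure: the records are hypothetical toy values (not taken from members.xlsx), and the `min_support` and `min_confidence` thresholds passed in the example are placeholders, so substitute the settings you actually used.

```python
HIGH_COST_THRESHOLD = 40_000  # from the assignment: high cost means cost >= $40,000

# Hypothetical toy records: (has_renal_failure, cost in the result period).
# In the real data the cost would come from the TA2 variable.
records = [
    (True, 85_000), (True, 12_000), (True, 9_500), (True, 41_000),
    (True, 3_000), (False, 2_000), (False, 55_000), (False, 700),
]


def high_cost_flag(cost, threshold=HIGH_COST_THRESHOLD):
    """Return 1 if the cost is at or above the threshold, else 0."""
    return 1 if cost >= threshold else 0


def rule_passes(records, min_support, min_confidence):
    """Check whether 'diagnosis -> high cost' clears both thresholds.

    Returns (support count, confidence, passes). Assumes at least one
    record has the diagnosis, otherwise confidence is undefined.
    """
    n_antecedent = sum(1 for has_dx, _ in records if has_dx)
    n_both = sum(high_cost_flag(cost) for has_dx, cost in records if has_dx)
    confidence = n_both / n_antecedent
    passes = n_both >= min_support and confidence >= min_confidence
    return n_both, confidence, passes


# Placeholder thresholds, chosen only to illustrate the mechanics.
print(rule_passes(records, min_support=2, min_confidence=0.5))
# prints: (2, 0.4, False)
```

In this toy example the rule has enough support but is dropped because its confidence falls below the minimum. A rule can be absent from the output for either reason, so both counts are worth checking.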
