Comparing the effectiveness of two medications by evaluating negative symptoms in patients diagnosed with schizophrenia

April 1, 2022

Schizophrenia is a cognitive and behavioral disorder characterized by a wide range of psychotic symptoms, which can be classified as positive or negative [1]. Positive symptoms reflect an excess or distortion of normal functions, whereas negative symptoms reflect a reduction in normal functions, such as the patient’s motivation and interest [2]. Moreover, evidence has shown that negative symptoms are strong predictors of poor outcomes in schizophrenia [3].

Although the literature offers several definitions for classifying predominantly negative symptoms, there is currently no consensus on any of them [4]. Negative symptoms are currently evaluated with standardized assessment tools such as the Positive and Negative Syndrome Scale (PANSS) [5], the Scale for the Assessment of Negative Symptoms (SANS) [6] and the Negative Symptom Assessment-16 (NSA-16). Of these, the PANSS is widely considered the gold standard [7]. However, the main disadvantage of these tools is that they are complex and time-consuming to complete, which makes them difficult to use in standard clinical practice [8]. For this reason, systematic measures of negative symptoms are rare or absent in electronic health records (EHRs).

To overcome this, at Holmusk we have developed an advanced Natural Language Processing (NLP) algorithm that transforms semi-structured free-text summaries of mental state evaluations (MSE) into quantitative values that can be used to measure and evaluate a patient’s mental state at a given time point [9]. Our real-world behavioral health database, the biggest of its kind, encompasses more than 920,000 patients and more than 75 million rows of data (including millions of drug and diagnostic records), which opens up many possibilities. From developing clinical studies that compare the effectiveness of drugs in real patients to building machine learning models that assess a patient’s risk of developing certain symptoms, there are many ways to create real-world impact in behavioral health.

In this article, we will showcase how NeuroBlu can be used to compare the effectiveness of two drugs in patients with schizophrenia. For this, we will evaluate the Clinical Global Impression-Severity (CGI-S) scale and specific negative symptoms captured from the MSE with Holmusk’s proprietary NLP algorithm. To achieve this, we will perform the following steps on NeuroBlu:

1. Build the study cohort

2. Create a category mapper project with Schizophrenia ICD codes

3. Define the MSE symptoms to be analyzed

4. Define the composite score that will be used to evaluate the MSE symptoms

5. Run the configurable script code

Build the study cohort

First, we will select adult patients diagnosed with schizophrenia. NeuroBlu has a user-friendly graphical interface that allows using inclusion and exclusion criteria to create the study cohort:

Create a category mapper project with Schizophrenia ICD codes

In NeuroBlu, disorders are stored using ICD-9-CM and ICD-10-CM codes. We provide a graphical user interface called Category Mapper that helps users group values (e.g. ICD diagnosis codes) into custom categories. Category Mapper includes template projects that can be duplicated, after which they appear in the user's project list, ready for use. For this article, we will duplicate the ‘Example Diagnosis Mappings’ template, which includes the most common psychiatric disorders.
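The grouping that Category Mapper performs can be pictured as a lookup from code prefixes to category names. The snippet below is only an illustrative sketch, not NeuroBlu's implementation: the `CATEGORY_PREFIXES` table and the `categorize` helper are hypothetical names, and matching on ICD-10-CM prefixes (F20 for schizophrenia, F31 for bipolar disorder) is a simplifying assumption.

```python
from typing import Optional

# Hypothetical mapping from a custom category to ICD-10-CM code prefixes.
# The real template groups many more disorders; these two are examples only.
CATEGORY_PREFIXES = {
    "schizophrenia": ("F20",),
    "bipolar disorder": ("F31",),
}


def categorize(icd10_code: str) -> Optional[str]:
    """Return the first category whose prefix matches the code, or None."""
    normalized = icd10_code.strip().upper()
    for category, prefixes in CATEGORY_PREFIXES.items():
        # str.startswith accepts a tuple of candidate prefixes.
        if normalized.startswith(prefixes):
            return category
    return None


print(categorize("F20.0"))   # code in the schizophrenia family
print(categorize("F31.9"))   # code in the bipolar disorder family
print(categorize("Z00.00"))  # code outside both example categories
```

Prefix matching keeps the sketch short; a real mapping would list the exact codes belonging to each category.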

Define the MSE symptoms to be analyzed

NeuroBlu stores the NLP-processed MSE data as 241 structured labels that represent the presence of various symptoms, with values indicating the severity of these symptoms. Examples are illustrated below:

NeuroBlu also stores a condensed version of the MSE structured labels, called RSAV, that includes only 81 labels. In this version, 68 labels take a value of 0 (if the symptom is normal / not present) or 1 (if the symptom is present).

In addition, we have 13 labels with a range of values that represent the severity of the symptoms. For example, the symptom “Affect” contains the following possible values: -2 (labile), -1 (blunted/constricted), 0 (reactive), 1 (intense) and 2 (flooded).
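To make the two kinds of RSAV labels concrete, here is a minimal sketch of how their values could be decoded. The value meanings are taken from the text above; the names `AFFECT_LEVELS`, `BINARY_LEVELS` and `describe` are hypothetical and do not reflect how NeuroBlu actually stores or exposes the labels.

```python
# Value meanings for the "Affect" label, as listed in the article.
AFFECT_LEVELS = {
    -2: "labile",
    -1: "blunted/constricted",
    0: "reactive",
    1: "intense",
    2: "flooded",
}

# Value meanings shared by the 68 binary labels.
BINARY_LEVELS = {0: "normal / not present", 1: "present"}


def describe(value: int, levels: dict) -> str:
    """Translate a stored label value into its description."""
    if value not in levels:
        raise ValueError(f"unexpected value {value!r}; allowed: {sorted(levels)}")
    return levels[value]


print(describe(-1, AFFECT_LEVELS))  # blunted/constricted
print(describe(1, BINARY_LEVELS))   # present
```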

The configurable script that we will use in the following sections allows us to choose the categories of labels we are going to analyze. For this project, we will use labels that are related to negative symptoms: delusion, mood, hallucination, attention/concentration, affect, suicidality, insight, general and executive functioning.

Define the composite score that will be used to evaluate the MSE symptoms

It is important to define how we are going to analyze the MSE symptoms, as there is currently no consensus in the literature on the methodology for analyzing this type of data. For this specific study, we are interested in summarizing the presence (or absence) of specific labels or symptoms using a composite score. This score will allow us to evaluate whether there is a greater or lesser presence of negative symptoms for a given patient.

We will define the composite score at a specific time point with the following formula:

and the formula for the normalized values would be:

Let us use an example to illustrate how this score is calculated. Suppose we have a patient with different symptoms at two time points (indicated in the table below).

The sum of the symptom values would be:

Next, the normalized values would be:

Thus, for this specific patient, the composite score at 60 days would be:

At 180 days, the composite score would be:

As we can see, this composite score allows us to summarize the presence of negative symptoms that a patient experiences at a specified time point. Note that the following section uses the term ‘proxy score’ instead of ‘composite score’.
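The exact formulas are not reproduced in this text, so the following is only a sketch of one way such a score could be computed, not the formula used in the NeuroBlu script. It assumes that each label value is normalized by dividing its absolute value by the largest magnitude the label can take, and that the score is the average of those normalized values. The function names and the example values are hypothetical.

```python
def normalized_value(value: float, max_abs_value: float) -> float:
    """Scale a label value to [0, 1] by its largest possible magnitude (assumed rule)."""
    return abs(value) / max_abs_value


def proxy_score(values: dict, max_abs_values: dict) -> float:
    """Average the normalized label values recorded at one time point (assumed rule)."""
    if not values:
        raise ValueError("at least one label value is required")
    total = sum(
        normalized_value(value, max_abs_values[name])
        for name, value in values.items()
    )
    return total / len(values)


# Made-up values for illustration only; this is not the patient from the article.
max_abs_values = {"hallucination": 1, "affect": 2, "insight": 1}
visit = {"hallucination": 1, "affect": -1, "insight": 0}

print(proxy_score(visit, max_abs_values))  # (1 + 0.5 + 0) / 3 = 0.5
```

Under these assumptions a score of 0 means that none of the selected symptoms is present, and higher scores mean a greater presence of symptoms.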

Run the configurable script code

In NeuroBlu, we provide our users with configurable, commented scripts in both Python and R, so that users only have to change certain parameters before running the code to obtain the desired analysis. In this case, we need to change the following parameters before running the code:

  • cohort_name: The name of the cohort that we built in the first section of this article.
  • diagnosis_category_mapper: The name of the diagnosis category mapper that we built in the second section of this article.
  • drug_name_1 and drug_name_2: The names of the drugs to be compared. In this article we will use chlorpromazine and fluphenazine.
  • period_days_study: The period of study in days. We will calculate the proxy score at the start and at the end of the study.
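As a rough illustration, the four parameters above could be collected and sanity-checked as follows. The parameter names and the two drug names come from the list; the `StudyParameters` class, the cohort name and the 180-day period are hypothetical placeholders, and the actual script may define its parameters differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyParameters:
    cohort_name: str
    diagnosis_category_mapper: str
    drug_name_1: str
    drug_name_2: str
    period_days_study: int

    def __post_init__(self) -> None:
        # Basic checks so that a typo is caught before the analysis runs.
        if self.drug_name_1.lower() == self.drug_name_2.lower():
            raise ValueError("drug_name_1 and drug_name_2 must differ")
        if self.period_days_study <= 0:
            raise ValueError("period_days_study must be a positive number of days")


params = StudyParameters(
    cohort_name="adult_schizophrenia",  # hypothetical cohort name
    diagnosis_category_mapper="Example Diagnosis Mappings",
    drug_name_1="chlorpromazine",
    drug_name_2="fluphenazine",
    period_days_study=180,  # hypothetical study period
)
print(params)
```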

After changing these parameters, we will run the code and visualize the results:

The first result that will appear is a descriptive table which contains values for the selected symptom labels, as well as demographics for both cohorts:

The second result will be a table containing the mean proxy score for both cohorts at two time points: the start of the study and the end of the study. In this exploratory analysis, we observe that at the end of the study, patients prescribed fluphenazine tend to have a greater presence of negative symptoms than patients treated with chlorpromazine.
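The aggregation behind such a table can be sketched in a few lines. The rows below are made-up numbers with generic cohort labels, used only to show the grouping step; they are not results from NeuroBlu, and the `mean_proxy_scores` helper is a hypothetical name rather than part of the script.

```python
from collections import defaultdict
from statistics import mean

# Made-up (cohort, timepoint, proxy_score) rows, for illustration only.
records = [
    ("drug_1", "start", 0.5),
    ("drug_1", "start", 0.25),
    ("drug_1", "end", 0.25),
    ("drug_1", "end", 0.25),
    ("drug_2", "start", 0.5),
    ("drug_2", "end", 0.75),
]


def mean_proxy_scores(rows):
    """Group scores by (cohort, timepoint) and return the mean of each group."""
    grouped = defaultdict(list)
    for cohort, timepoint, score in rows:
        grouped[(cohort, timepoint)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


summary = mean_proxy_scores(records)
for (cohort, timepoint), value in sorted(summary.items()):
    print(f"{cohort:8s} {timepoint:6s} {value:.3f}")
```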

Finally, we provide the function plotPatientHistory() which can be used to create a plot showing the change in the proxy score during the study period for a specific patient. By default, the script plots the first five patients in both cohorts:


In this article, we have shown how to assess the presence of negative symptoms in a cohort of patients to evaluate the effectiveness of two drugs. While this is an exploratory analysis, it showcases the potential that NeuroBlu and our proprietary NLP algorithm have for developing high-impact and novel studies using the biggest real-world behavioral health database.


1. Kahn, R. S., Sommer, I. E., Murray, R. M., Meyer-Lindenberg, A., Weinberger, D. R., Cannon, T. D., O’Donovan, M., Correll, C. U., Kane, J. M., Van Os, J., & Insel, T. R. (2015). Schizophrenia. Nature Reviews Disease Primers, 1(1), 1–23.

2. Galderisi, S., Mucci, A., Buchanan, R. W., & Arango, C. (2018). Negative symptoms of schizophrenia: new developments and unanswered research questions. The Lancet Psychiatry, 5(8), 664–677.

3. Foussias, G., Agid, O., Fervaha, G., & Remington, G. (2014). Negative symptoms of schizophrenia: Clinical features, relevance to real world functioning and specificity versus other CNS disorders. European Neuropsychopharmacology, 24(5), 693–709.

4. Correll, C. U., & Schooler, N. R. (2020). Negative Symptoms in Schizophrenia: A Review and Clinical Guide for Recognition, Assessment, and Treatment. Neuropsychiatric Disease and Treatment, 16, 519.

5. Kay, S. R., Fiszbein, A., & Opler, L. A. (1987). The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophrenia Bulletin, 13(2), 261–276.

6. Andreasen, N. C. (1989). The Scale for the Assessment of Negative Symptoms (SANS): Conceptual and Theoretical Foundations. The British Journal of Psychiatry, 155(S7), 49–52.

7. Opler, M. G. A., Yavorsky, C., & Daniel, D. G. (2017). Positive and Negative Syndrome Scale (PANSS) Training: Challenges, Solutions, and Future Directions. Innovations in Clinical Neuroscience, 14(11–12), 77.

8. Kumari, S., MPH, M., Malik, M., Florival, M. C., Manalai, M. P., MD, & MD, S. S. (2017). An Assessment of Five (PANSS, SAPS, SANS, NSA-16, CGI-SCH) commonly used Symptoms Rating Scales in Schizophrenia and Comparison to Newer Scales (CAINS, BNSS). Journal of Addiction Research & Therapy, 8(3).

9. Mukherjee, S. S., Yu, J., Won, Y., McClay, M. J., Wang, L., Rush, A. J., & Sarkar, J. (2020). Natural Language Processing-Based Quantification of the Mental State of Psychiatric Patients. Computational Psychiatry, 4(0), 76.
