I took an entire semester on structural equation modeling (SEM), and I still only have a rather fuzzy idea of what it is. In short, this is the best technical definition I’ve arrived at.

## A Brief Technical Definition of SEM

Structural equation modeling is a research method that combines factor analysis and path analysis to mathematically describe complex conceptual mechanisms in a medium-sized dataset.

Factor analysis measures latent traits indirectly from measurable variables, either by exploring new models or by confirming hypothesized ones.

Path analysis is a chain of linear regressions in which the outcome (dependent variable) of one regression becomes the exposure (independent variable) of the next.
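To make the "repeated regression" idea concrete, here is a minimal Python sketch on simulated data. The variable names (`exposure`, `mediator`, `outcome`) and the coefficients 0.6 and 0.4 are made up for illustration. Real SEM software fits all of the equations at once rather than one at a time, so treat this as a rough picture, not as how lavaan or Stata actually do it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated chain (all numbers are assumptions): exposure -> mediator -> outcome
exposure = rng.normal(size=n)
mediator = 0.6 * exposure + rng.normal(scale=0.5, size=n)
outcome = 0.4 * mediator + rng.normal(scale=0.5, size=n)


def ols_slope(x, y):
    """Slope from a simple least-squares fit of y on x."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


# First regression: the mediator is the outcome.
a = ols_slope(exposure, mediator)
# Second regression: the same mediator is now the exposure.
b = ols_slope(mediator, outcome)

# In this simple chain, the effect of exposure on outcome
# that runs through the mediator is the product of the two paths.
indirect = a * b
print(f"a = {a:.2f}, b = {b:.2f}, indirect effect = {indirect:.2f}")
```

The estimates should land near the simulated values of 0.6 and 0.4, though they will not match exactly because of the added noise.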

That may make some semblance of sense if you’re familiar with statistics and psychometrics, the field of using surveys to measure educational and psychological abilities. So what is SEM to normal, decent, God-fearing people?

## A Conceptual Definition of SEM

Structural equation modeling is a loosely defined quantitative method of building and then testing a mechanistic model. If a researcher understands a concept, they can propose a model in which one step affects another, which may then inhibit a third step, and so on. Once a model of sequential steps is described, it can be tested against similar models by various statistical methods. The inferior mechanistic model is discarded in an iterative process as the researcher strives ever nearer to the truth.

Honestly I don’t think the above description is elegant enough to explain SEM to my grandmother. But it should be enough to communicate with other students and researchers.

## What do SEM Models Look Like?

Okay, don’t lose me here. Just relax and kind of let your eyes go out of focus. Don’t look at the details.

Whew, look away!

This is a Google image of an SEM diagram. I’ve no idea what it’s trying to explain. And that’s kind of the fun of SEM. You look really smart, but you’ll probably have a rough time explaining your theoretical mechanism to anyone.

These are mechanisms, and they look like spiderwebs of circles and rectangles with arrows going everywhere. There’s a technical language to the SEM model diagrams that you can learn from some of the links at the bottom. But for now, just know how to spot an SEM model diagram in the wild.

## An Application of SEM: Survey Data

SEM is a method that has a couple of really great applications and many terrible ones. It’s great if you have a data set on the order of 100 to 1,000 observations. The variables in the model should also be continuous, as dichotomous and integer variables lead to mathematical problems. For this reason, SEM is more common in psychology and the humanities than in medicine or epidemiology…at least for now. In particular, SEM tends to be applied to survey data.

Survey data takes a bunch of items, adds them up in some fashion, and produces a continuous score. We all know this from elementary school, where if you got 79% on a quiz and your friend got 80%, then she was smarter than you and that was that.

While this might serve some purpose in school, it’s not great when assessing whether a psychiatric patient is competent to make decisions with their physician on managing their chronic kidney disease. So instead we ask the patient a series of multiple-choice questions, score it according to a research-validated method, and then use clinical expertise and guidelines to decide if the patient really is capable of refusing medical advice. SEM may have been used by the team of researchers who validated that survey, which is now used by the psychiatrist in the emergency room…in a utopian, ideal world at least.

In the real world, clinical data is often not continuous, and SEM doesn’t have a great fit. Genetics research may already have a few good applications. But many fields do not. Hopefully, the potential uses of SEM are broadening as we get better at measuring every possible variable with greater and greater precision.

Imagine a world where we know exactly when the last dose of insulin was given for an entire population of diabetics. Then we could use the variable “minutes since last insulin dose was given”, instead of saying “a couple of hours ago” or “yesterday”. This information may be a useful intermediate step in an SEM model that attempts to characterize how processed food and alcohol exposure affects hypoglycemic seizures in impoverished diabetic patients.

Better yet, imagine if the researchers knew exactly what these patients ate, how much of it, and how often? Creepy to many of us, but this sort of thing would get a dietary epidemiologist more riled up than Guy Fieri at that one pizza place in that one city.

Thank you for finishing my graduate student-level explanation of SEM. I find it’s easy to sit in class listening to SEM lecture after SEM lecture and still not be able to explain what I was just told. Hopefully you can now decide whether it’s worth pursuing further.

Below are several superior and much more thorough explanations of SEM.

## Watch the Professionals Explain SEM

If you want the most accessible Structural Equation Modeling lectures, then check out Structural Equation Modelling (SEM): What it is and what it isn’t by Patrick Sturgis from the National Centre for Research Methods. I supplemented my course work with this mini-series, and it was a great addition. Maybe the British accent is the right proper way to hear someone explain complex statistical methods.

A technical salesman explains how to use SEM in Stata, which I found was the most useful way to conduct SEM in my class. Unfortunately, Stata costs more than my travel grant award, so it might not be useful to those without institutional access.

Fortunately, there’s a free R package for SEM called lavaan, which has a ton of documentation for the dedicated few to peruse.

Statistics Solutions has a 1-hour lecture on SEM, which might be great for a deep dive into SEM.

Good luck.

How psychometrics and medicine have trouble communicating

“If the reader is to grasp what the writer means, the writer must understand what the reader needs.

“Science is often hard to read. Most people assume that its difficulties are born out of necessity, out of the extreme complexity of scientific concepts, data and analysis. We argue here that complexity of thought need not lead to impenetrability of expression; we demonstrate a number of rhetorical principles that can produce clarity in communication without oversimplifying scientific issues. The results are substantive, not merely cosmetic: Improving the quality of writing actually improves the quality of thought.”

#### Source

American Scientist: The Science of Scientific Writing by George Gopen and Judith Swan

“This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML has become one of the hottest fields of study today, taken up by undergraduate and graduate students from 15 different majors at Caltech. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects. The lectures below follow each other in a story-like fashion:

• What is learning?
• Can a machine learn?
• How to do it?
• How to do it well?
• Take-home lessons.

“The 18 lectures are about 60 minutes each plus Q&A.”

#### Source

Caltech: Learning From Data Machine Learning Course by Yaser S. Abu-Mostafa

Textbook: Learning From Data by Yaser S. Abu-Mostafa and Malik Magdon-Ismail

Centering continuous independent variables was one of the earliest lessons in my linear regression class. I was recently asked to explain the point of going through the trouble of centering. I was at a loss, and realized I had been assuming the answer was obvious when it was not.

After a quick Google search, I found an article that explained the answer well. In short, centering is useful when interpreting the intercept is important. The author’s example uses the age at which infants first develop language. Her original article has been copied below.

## Should You Always Center a Predictor on the Mean?

by Karen Grace-Martin

Centering predictor variables is one of those simple but extremely useful practices that is easily overlooked.

It’s almost too simple.

Centering simply means subtracting a constant from every value of a variable. What it does is redefine the 0 point for that predictor to be whatever value you subtracted. It shifts the scale over, but retains the units.

The effect is that the slope between that predictor and the response variable doesn’t change at all. But the interpretation of the intercept does.

The intercept is just the mean of the response when all predictors = 0. So when 0 is out of the range of data, that value is meaningless. But when you center X so that a value within the dataset becomes 0, the intercept becomes the mean of Y at the value you centered on.
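A quick numerical check of this claim, which is not part of the original article, is sketched below in Python. The data are simulated and the numbers (a predictor between 20 and 60, a true intercept of 5, a true slope of 2) are assumptions chosen only so that 0 falls well outside the range of X.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data (assumed values): X never gets anywhere near 0.
x = rng.uniform(20, 60, size=200)
y = 5.0 + 2.0 * x + rng.normal(size=200)

# Fit on the raw predictor.
slope_raw, intercept_raw = np.polyfit(x, y, 1)

# Fit again after centering X on its mean.
c = x.mean()
slope_centered, intercept_centered = np.polyfit(x - c, y, 1)

# The slope is identical; only the intercept moves.
print("slopes:    ", slope_raw, slope_centered)
# The centered intercept is the fitted value of Y at X = c,
# which for mean-centering equals the sample mean of Y.
print("intercepts:", intercept_raw, intercept_centered, y.mean())
```

The raw intercept is an extrapolation back to X = 0, while the centered intercept describes Y at a value of X that actually occurs in the data.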

What’s the point? Who cares about interpreting the intercept?

It’s true. In many models, you’re not really interested in the intercept. In those models, there isn’t really a point, so don’t worry about it.

But, and there’s always a but, in many models interpreting the intercept becomes really, really important. So whether and where you center becomes important too.

A few examples include models with a dummy-coded predictor, models with a polynomial (curvature) term, and random slope models.

Let’s look more closely at one of these examples.

In models with a dummy-coded predictor, the intercept is the mean of Y for the reference category—the category numbered 0. If there’s also a continuous predictor in the model, X2, that intercept is the mean of Y for the reference category only when X2=0.

If 0 is a meaningful value for X2 and within the data set, then there’s no reason to center. But if neither is true, centering will help you interpret the intercept.

For example, let’s say you’re doing a study on language development in infants. X1, the dummy-coded categorical predictor, is whether the child is bilingual (X1=1) or monolingual (X1=0). X2 is the age in months when the child spoke their first word, and Y is the number of words in their vocabulary for their primary language at 24 months.

If we don’t center X2, the intercept in this model will be the mean number of words in the vocabulary of monolingual children who uttered their first word at birth (X2=0).

And since infants never speak at birth, it’s meaningless.

A better approach is to center age at some value that is actually in the range of the data. One option, often a good one, is to use the mean age of first spoken word of all children in the data set.

This would make the intercept the mean number of words in the vocabulary of monolingual children for those children who uttered their first word at the mean age that all children uttered their first word.

One problem is that the mean age at which infants utter their first word may differ from one sample to another. This means you’re not always evaluating that mean at the exact same age. It’s not comparable across samples.

So another option is to choose a meaningful value of age that is within the values in the data set. One example may be at 12 months.

Under this option the interpretation of the intercept is the mean number of words in the vocabulary of monolingual children for those children who uttered their first word at 12 months.

The exact value you center on doesn’t matter as long as it’s meaningful, holds the same meaning across samples, and is within the range of the data. You may find that choosing the lowest value or the highest value of age is the best option. It’s up to you to decide the age at which it’s most meaningful to interpret the intercept.
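The infant example can be sketched in Python as well. This is an illustration added here, not code from the original article. Everything in the simulation is invented: the sample size, the age range of 8 to 18 months, and the "true" values (400 words at 12 months for monolingual children, 15 fewer words per month of delay, 30 fewer for bilingual children in their primary language).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Simulated, made-up data for the example in the text.
bilingual = rng.integers(0, 2, size=n)        # X1: 1 = bilingual, 0 = monolingual
first_word_age = rng.uniform(8, 18, size=n)   # X2: age at first word, in months
vocab = (400 - 15 * (first_word_age - 12) - 30 * bilingual
         + rng.normal(scale=20, size=n))      # Y: words at 24 months


def fit(center):
    """Least-squares fit of vocab on X1 and (X2 - center).

    Returns [intercept, coefficient for X1, coefficient for X2].
    """
    design = np.column_stack([np.ones(n), bilingual, first_word_age - center])
    coef, *_ = np.linalg.lstsq(design, vocab, rcond=None)
    return coef


uncentered = fit(0.0)    # intercept = monolingual child who spoke at birth
at_12_months = fit(12.0)  # intercept = monolingual child who spoke at 12 months

print("uncentered:  ", uncentered)
print("centered @12:", at_12_months)
```

Both fits give the same coefficients for X1 and X2. Only the intercept differs: the uncentered one describes an impossible child, while the centered one should land near the simulated 400 words.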

#### Source

The Analysis Factor: Should You Always Center a Predictor on the Mean? by Karen Grace-Martin

“How not to collaborate with a biostatistician. This is what happens when two people are speaking different research languages! My current workplace is nothing like this, but I think most biostatisticians have had some kind of similar experiences like this in the past!”

#### Source

YouTube: Biostatistics vs. Lab Research by JavaMama926

This site has reporting guidelines for all types of studies. These are checklists for writing all parts of a paper on these various study types.

CONSORT is for clinical trials and STROBE is for observational studies.

#### Source

Equator Network: Reporting guidelines for main study types

Covariance and correlation are two statistical concepts that are closely related, both conceptually and by their name. The excerpts below are from a concise article that differentiates them.

## Difference Between Covariance and Correlation

“Correlation is a special case of covariance which can be obtained when the data is standardised. Now, when it comes to making a choice, which is a better measure of the relationship between two variables, correlation is preferred over covariance, because it remains unaffected by the change in location and scale, and can also be used to make a comparison between two pairs of variables.”

## Key Differences Between Covariance and Correlation

“The following points are noteworthy so far as the difference between covariance and correlation is concerned:

1. “A measure used to indicate the extent to which two random variables change in tandem is known as covariance. A measure used to represent how strongly two random variables are related is known as correlation.
2. “Covariance is nothing but a measure of correlation. On the contrary, correlation refers to the scaled form of covariance.
3. “The value of correlation takes place between -1 and +1. Conversely, the value of covariance lies between -∞ and +∞.
4. “Covariance is affected by the change in scale, i.e. if all the value of one variable is multiplied by a constant and all the value of another variable are multiplied, by a similar or different constant, then the covariance is changed. As against this, correlation is not influenced by the change in scale.
5. “Correlation is dimensionless, i.e. it is a unit-free measure of the relationship between variables. Unlike covariance, where the value is obtained by the product of the units of the two variables.”
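Two of the points above are easy to verify numerically: correlation is the covariance of standardized data, and rescaling changes covariance but not correlation. Below is a minimal Python sketch on simulated data. The variables and the scaling constants (100 and 3) are arbitrary choices for illustration and do not come from the quoted article.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated pair of related variables (assumed relationship).
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)


def standardize(v):
    """Subtract the mean and divide by the sample standard deviation."""
    return (v - v.mean()) / v.std(ddof=1)


cov_xy = np.cov(x, y)[0, 1]
corr_xy = np.corrcoef(x, y)[0, 1]

# Correlation is the covariance of the standardized variables.
cov_standardized = np.cov(standardize(x), standardize(y))[0, 1]

# Multiply each variable by a different constant.
cov_scaled = np.cov(100 * x, 3 * y)[0, 1]
corr_scaled = np.corrcoef(100 * x, 3 * y)[0, 1]

print("covariance:  ", cov_xy, "->", cov_scaled)    # multiplied by 100 * 3
print("correlation: ", corr_xy, "->", corr_scaled)  # unchanged
print("cov of standardized data:", cov_standardized)
```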

#### Source

Before forecasting its future, there is likely merit in explaining what physiatry is.

# What is Physiatry/PM&R?

“Physiatry, also known as physical medicine and rehabilitation (PM&R), is a branch of medicine that aims to enhance and restore functional ability and quality of life to those with physical impairments or disabilities.”

– Wikipedia

Medicine tends to be sterile. When the end goal is reduced to keeping the patient alive, or moving a lab value within its appropriate range, doctors can forget to be human. Resources are stretched thin in a county hospital in a large metropolitan area. Providers are only attending to biological issues because they are prioritizing scarce resources. They are not trained to deal with the more pressing social issues. What is the point in treating a patient’s asthma, discharging them back to the streets, waiting for another exacerbation, and then rounding on them next week after they’ve been admitted once again from the ER? Patients like these need social work and preventive public health measures. Instead they get expensive medications once every few months. Medical care can feel callous and at times even cruel.

On the other end of the spectrum, there are fields based entirely in human connection, but they lack teeth. Naturopathy and alternative medicines have either been shown to have no efficacy, or there is no incentive to research them because they are assumed to be entirely based on the placebo effect. Sometimes people need to be heard and feel connected with their provider. These fields take advantage of this, and patients may be happier, albeit less healthy.

PM&R is western medicine that focuses on the patient’s function and quality of life. Chronic pain patients have sifted through the medical system, frustrated by the lack of resolution to their pain. They’ve stumped doctors who cannot do anything for them because all of their lab values are correct and they seem healthy enough by the protocol standards. Surgeons will happily perform surgery, but it seems a drastic move exposing patients to serious risks, which can be minimized or ignored during the pre-op. Surgery may be the best option for some, but certainly not all of these patients. Physiatry can offer medical treatment, alongside physical therapy for a multidisciplinary approach to increasing patient health and quality of life.

But it turns out, not many procedures used by physiatrists have been supported by clinical evidence.

# Dr. Braddom Predicts the Future of PM&R

Dr. Randall Braddom is a clinical professor of physiatry. While at Rutgers Medical School in 2014, he gave a 100-slide PowerPoint presentation that concisely summarizes the field of PM&R and its future direction from his perspective. Dr. Braddom acknowledges how worthless predictions of the future often are, but an experienced physician creating a deep portrayal of their specialty is worth far more than SDN forums.

According to Dr. Braddom, the field of physiatry is placing more value on research. One of the reasons is that physiatric procedures have not been validated in randomized clinical trials, and insurance companies are eliminating reimbursements for procedures without scientific evidence supporting their efficacy. The large proportion of physiatrists in clinical practice may see large reductions in their financial reimbursement for some of their procedures, such as sacroiliac and Z-joint (zygapophysial) injections. A whole field of doctors potentially not getting paid for their work may be a powerful force. It seems that clinical research opportunities in PM&R will likely thrive in the near future.

Below are a select few slides from Dr. Braddom’s presentation.

# PM&R Research will Boom Soon

## Trend to Evidence Based Medicine

Evidence Basis of PM&R is Significantly limited due to:

• Variability/complexity
• Limited research
• Distance from molecular biology
• Clinical studies lack analytical rigor

## Research is Critical for PM&R Practice

• Outcome Studies are key to practice survival
• Randomized controlled trials (RCT’s)
• Almost no other kind of research is taken seriously
• Uncontrolled research is only a pilot study, at best
• Laboratory moving closer to the bedside
• New emphasis on Evidence Based Medicine in Health Care Reform

## Few Physiatrists Have Become High Quality Researchers

#### Why?

• Length of training required
• Debt level problem
• Perceived decrease in research funding
• Instability in research funding
• Monetary rewards of clinical practice
• Physiatric personality: people oriented rather than rat oriented

“It has also been generally agreed that Rehabilitation research has not done well in fulfilling its objective of providing a foundation of knowledge for rehabilitation practice.”

Lieberman (1993)

## AAPMR LNA: 2004 Physiatric Effort Report

• Outpatient 50%
• Inpatient 23%
• Teaching/CME 4%
• Research 3%
• Miscellaneous 10%

# What to Do After Residency

## 2014 ABPMR Subspecialty Exams for Physiatrists

• Sports Medicine
• Neuromuscular Medicine
• Pain Medicine
• Hospice and Palliative Medicine
• Pediatric Rehabilitation Medicine
• Spinal Cord Injury Medicine
• Brain Injury Medicine

## What Percentage of Residents Join Orthopedic Groups?

• 22%
• Range from 10-40%

On residents joining Orthopedic groups:

“This is a sin against humanity!”

– PM&R Chair

From reading forums, it sounds like being a physiatrist working in an orthopedic practice may be a horrible experience. Surgeons with large personalities shunt all their conservative preventive care to one physiatrist on the team because it is a waste of their time to do injections when there are more challenging surgeries to be performed.

I personally would not want to spend so much time in training to be looked down upon or taken advantage of financially during my day-to-day practice. I don’t see the allure of working in orthopedic groups that the 22% of survey respondents said they are doing.

# PM&R is a Great Field

• Patients appreciate what we do
• Not limited by an organ
• Jobs of all types available
• Population demographics favor us
• Good balance of procedure/E&M
• Good physiatric profile/nice people
• Small (10,000)

PM&R is focused on patient outcomes and quality of life. There is a wide variety of procedures, subspecialties, and practice styles within physiatry. Dr. Braddom presented many trends in the field as of 2014, and where he expects it to head in the future. He underscores the growing research opportunities in PM&R, the breadth of fellowships for sub-specialization, and that working in an orthopedic group may be less than ideal. Regardless of his prophecies, as of now physiatry looks like a promising career path.

#### Source

Wikipedia: Physiatry

The Atlantic: The Problem With Satisfied Patients

Doctor Voices: What is PM&R?

Doximity: Randall L. Braddom, M.D., M.S.

The California Society of Physical Medicine and Rehabilitation: The Future of PM&R From a PGY-46