Where did the knowledge go?

What does it mean when your CME participants score worse on a post-test assessment (compared to pre-test)?

Here are some likely explanations:

  1. The data was not statistically significant.  Significance testing determines whether we reject the null hypothesis (null hypothesis = pre- and post-test scores are equivalent).  If the difference was not significant (ie, P > .05), we can’t reject this assumption.  If the pre/post response was too low to warrant statistical testing, the direction of change is meaningless – you don’t have a representative sample.
  2. Measurement bias (specifically, “multiple comparisons”).  This measurement bias results from multiple comparisons being conducted within a single sample (ie, asking dozens of pre/post questions within a single audience).  The issue with multiple comparisons is that the more questions you ask, the more likely you are to find a significant difference where none should exist (and where none would appear under a more focused assessment).  Yes, this is a bias to which many CME assessments are subject.
  3. Bad question design. Did you follow key question development guidelines?  If not, the post-activity knowledge drop could be due to misinterpretation of the question.  You can learn more about question design principles here.
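
To make the first two explanations concrete, here is a minimal Python sketch.  It rests on assumptions that are mine, not part of the original argument: the pre- and post-test respondents are treated as two unpaired groups, significance is checked with a pooled two-proportion z-test, and the questions are treated as independent when estimating the chance of a false positive.  The function names are hypothetical.

```python
import math


def two_proportion_p_value(correct_pre, n_pre, correct_post, n_post):
    """Two-sided p-value from a pooled two-proportion z-test.

    Assumes unpaired pre/post groups (an assumption; matched
    respondents would call for a paired test instead).
    """
    p_pre = correct_pre / n_pre
    p_post = correct_post / n_post
    pooled = (correct_pre + correct_post) / (n_pre + n_post)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_pre + 1 / n_post))
    if se == 0:
        # Everyone (or no one) answered correctly: no detectable difference.
        return 1.0
    z = (p_post - p_pre) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def familywise_error_rate(num_questions, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_questions


def bonferroni_alpha(num_questions, alpha=0.05):
    """Per-question threshold under a Bonferroni correction."""
    return alpha / num_questions


if __name__ == "__main__":
    # Illustrative numbers only: 60/100 correct pre, 55/100 correct post.
    p = two_proportion_p_value(60, 100, 55, 100)
    print(f"p-value for the pre/post drop: {p:.3f}")
    print(f"False-positive risk over 20 questions: {familywise_error_rate(20):.2f}")
    print(f"Bonferroni threshold for 20 questions: {bonferroni_alpha(20):.4f}")
```

With these made-up numbers, a five-point drop in a sample of 100 is nowhere near significant, and testing 20 questions at P < .05 carries roughly a 64% chance of at least one spurious "significant" result.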

CME Outcomes Statistician, first grade

I was very excited to have my CMEPalooza session (Secrets of CME Outcome Assessment) officially sanctioned by the League of Assessors (LoA).  Accordingly, participants who passed the associated examination were awarded “CME Outcome Statistician, first grade” certifications.  It’s a grueling test, but three candidates made it through and received their certifications today (names withheld due to exclusivity).


More good news…I petitioned the LoA to extend the qualifying exam for another six weeks (expiring May 29, 2015) and was officially approved!  So you can still view the CMEPalooza session (here) and then take the qualifying exam (sorry, exam is now closed). Good luck!

Physician self-assessment questions

Let’s officially retire this pre/post-activity question:

<pre-activity> How would you rate your knowledge of X? (or the common variant: How confident are you in your ability to do X?)

<post-activity> After having participated in this activity, how would you rate your knowledge of X?  (or …how confident are you now in your ability to do X?)

First and foremost, it’s really lazy.  Second, we’ve known for long enough that physician self-assessments are reliably unreliable (Davis et al, 2006).  It’s better to ask no question than a bad one.


Be patient on those outcomes

Oh, I so want to say I measure patient outcomes.  Everyone gets so excited.  Imagine these two presentation titles: 1) “Reliability and Validity in Educational Outcome Assessment” and 2) “Measuring Patient Outcomes Associated with CME Participation”.  Which one are you going to attend?  Well…yes, to most folks those both sound pretty boring.  But this is a CME blog.  And in this part of town, it’d be like asking whether you’d rather hang out with some guy who runs a strip mall accounting firm or Will Ferrell.

But I’m not Will Ferrell.  And instead of an accountant, I’d like to introduce you to Drs. Cook and West, who present a very clear and thoughtful piece on why Will Ferrell really isn’t that funny…er, why patient outcomes may not be the best CME outcome target (click here for the article).

Read this article and be prepared.  If you’re presenting on patient outcomes, I’m going to ask about things like “dilution” and “teaching-to-the-test”.  Unless, of course, you are Will Ferrell.  In which case, thank you for Elf.


Recipe for CME

How do you cook CME?  Maybe simmer KOL in a venue sauce and add enduring material to taste?  And how do you select your ingredients?  Are you a student of food theory or do you just feel your way through?

Well, I’m supposed to be scientifically-minded, so my pantry is full of evidence-based options.  Wait…did I say full?  I meant I know these four things:

  1. Live activities are more savory than print
  2. You’ll make a better soup with multi-media
  3. Multiple tastes are preferred to just one
  4. Case-based discussions are the most important seasoning

According to Marinopoulos SS, et al., that’s all we’ve got to work with.  When you don’t know who’s coming to dinner, how hungry they are, or any of their possible dietary restrictions, you’ve got to make CME magic using only these four things. That’s pretty bleak.

Why don’t we know more?  Too few studies, no standardization, and very little reliability or validity data to support findings.  We outcome experts may all be wearing toques, but apparently we only make french fries.


Commitment to change: good night, sweet prince?

Commitment to change (CTC) questions are the caboose of every post-activity CME evaluation – stripped of all relevancy and sustained solely by nostalgia. Thirty years since its introduction, we can now all retire this method, confident that it has served us well, but that it’s now time for something more…app-ish.  And off it goes, grumbling its final words toward obscurity: “…but, you never really knew me”.

Before you dismiss CTC, check out this article.  People have been studying CTC for a long time.  And there’s value to this approach – assuming you use it correctly.  Should you use a follow-up survey?  When?  How?  How should you word the questions?  Include a rating scale?  And how should you sort through and interpret the results?  This stuff all matters. And you won’t find an easier-to-digest summary than this 2010 article in Evaluation & the Health Professions.

So, yes, if you’re simply maintaining a “what are you going to change in your practice” question at the end of every CME evaluation – definitely send that packing.  Then read the aforementioned article.  You’ll find that CTC has limitations, but when done in accordance with the latest evidence, there’s a lot of good data to be had.

Thoughts on organizing your outcomes data

An experiment begins with a hypothesis. For example…I suspect that the next person to enter this coffee shop will be a hipster (denied, by the way).

A neat and tidy hypothesis for CME outcome assessment might read: I suspect that participants in this CME activity will increase compliance with <insert evidence-based quality indicator here>.

Unfortunately, access to data that would answer such a question is beyond the reach of most CME providers. So we use proxy measures such as knowledge tests or case vignette surveys through which we hope to show data suggestive of CME participants increasing their compliance with <insert evidence-based quality indicator here>.

Although this data is much easier to access, it can be pretty tedious to weed through. Issue #1: How do you reduce the data across multiple knowledge or case vignette questions into a single statement about CME effectiveness? Issue #2: How do you systematically organize the outcomes data to develop specific recommendations for future CME?

For issue #1, I’d recommend using “effect size”. There’s more about that here.
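
As a rough illustration, here is a minimal Python sketch of one common effect size, Cohen’s d with a pooled standard deviation.  This is an assumption on my part: the linked post may use a different measure, and the function name and the idea of feeding it per-participant scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(pre_scores, post_scores):
    """Cohen's d for two independent groups, using a pooled SD.

    Each list holds one summary score per participant (for example,
    percent correct across all knowledge questions). Each group needs
    at least two scores so a sample variance can be computed.
    """
    n_pre, n_post = len(pre_scores), len(post_scores)
    pooled_var = (
        (n_pre - 1) * variance(pre_scores)
        + (n_post - 1) * variance(post_scores)
    ) / (n_pre + n_post - 2)
    pooled_sd = math.sqrt(pooled_var)
    if pooled_sd == 0:
        raise ValueError("scores have no variation; effect size is undefined")
    # Positive values mean post-activity scores are higher.
    return (mean(post_scores) - mean(pre_scores)) / pooled_sd


if __name__ == "__main__":
    # Illustrative scores only.
    print(cohens_d([2, 4, 6], [4, 6, 8]))
```

The appeal is that a whole battery of questions collapses into one number that can be compared across activities.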

For issue #2, consider organizing your outcome results into the following four buckets (of note, there is some overlap between these buckets):

1. Unconfirmed gap – pre-activity question data suggests knowledge or competence already high (typically defined as >70% of respondents identifying the evidence-based correct answer OR agreeing on a single answer if there is no correct response). Important note: although we shouldn’t expect every anticipated gap to be present in our CME participants, one cause of an unconfirmed gap (other than a bad needs assessment) is the use of assessment questions that are too easy and/or don’t align with the education.

2. Confirmed gap – pre-activity question data suggests that knowledge or competence is sufficiently low to warrant educational focus (typically defined as <70% of respondents identifying the evidence-based correct answer OR agreeing on a single answer if there is no correct response)

3. Residual gap

a. Post-activity data only = typically defined as <70% of respondents identifying the evidence-based correct answer OR agreeing on a single answer if there is no evidence-based correct response

b. Pre- vs. post-activity data = no significant difference between pre- and post-activity responses

4. Gap addressed

a. Post-activity data only = typically defined as >70% of respondents identifying the evidence-based correct answer OR agreeing on a single answer if there is no correct response

b. Pre- vs. post-activity data = significant difference between pre- and post-activity responses

Most important to note, if the outcome assessment questions do not accurately reflect gaps identified in the needs assessment, the results of the final report are not going to make any sense (no matter how you organize the results).
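
The four buckets above can be sketched as a small Python function.  This is only one way to encode the rules, and several details are my assumptions: the function and parameter names are hypothetical, a result of exactly 70% is treated as a gap, and the significance flag is taken to mean a significant improvement rather than any significant difference.

```python
from typing import List, Optional

THRESHOLD = 70.0  # percent of respondents choosing the correct (or modal) answer


def classify_gap(
    pre_pct: Optional[float] = None,
    post_pct: Optional[float] = None,
    significant_change: Optional[bool] = None,
) -> List[str]:
    """Return every bucket a question falls into (buckets can overlap)."""
    labels = []

    # Buckets 1 and 2: was the anticipated gap present before the activity?
    if pre_pct is not None:
        # Assumption: exactly 70% counts as a confirmed gap.
        labels.append("unconfirmed gap" if pre_pct > THRESHOLD else "confirmed gap")

    # Buckets 3 and 4: what happened after the activity?
    if pre_pct is not None and significant_change is not None:
        # Pre- vs. post-activity data: rely on the significance test.
        # Assumption: True means a significant improvement.
        labels.append("gap addressed" if significant_change else "residual gap")
    elif post_pct is not None:
        # Post-activity data only: rely on the 70% threshold.
        labels.append("gap addressed" if post_pct > THRESHOLD else "residual gap")

    return labels


if __name__ == "__main__":
    print(classify_gap(pre_pct=45, significant_change=True))
    print(classify_gap(post_pct=60))
```

Returning a list rather than a single label reflects the overlap between buckets: a question can sit in a confirmed gap and still be a residual gap after the activity.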

