Author Topic: A short Dissertation On Quality control By Steve Dimse  (Read 4010 times)

Offline Beaudog

  • Forecaster
  • *****
  • Posts: 1206
A short Dissertation On Quality control By Steve Dimse
« on: March 06, 2014, 10:20:18 AM »

 Just for the record, I had this problem a few days ago as well. QC flagged my dewpoint, even though my station was fine. One day of errors, and then everything went back to normal.

I'd like to try again to convince people that chasing QA is not worthwhile. Most of you probably don't know my life history, but my first career was as a professor of emergency medicine for the University of Miami. All profs have an area where they spend a little extra time to become the local expert of something. With my interest in computers and math I naturally became the UM SoM expert in quantitative analysis and critical analysis. The skill doesn't precisely carry over to meteorology, but the principles do.

The first thing I want to share is how we look at lab tests. If you test some parameter in a group of normal humans, say sodium, you will get a very wide range of values even if your equipment works perfectly. So we rather arbitrarily assume that the values will follow a bell shaped distribution (they don't) and draw the line for normal such that 95% of the healthy population fits in there. This is really important, because it means that fully 5% of healthy people have abnormal results for this and any other test. Furthermore, if you run a panel of 20 lab tests like we love to do nowadays, on average every healthy person will have one abnormal value. This is a big reason why docs don't want you to have your lab results - it takes us forever to convince people that lab values marked abnormal by the lab are actually normal.
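The arithmetic behind the 20-test panel can be checked in a few lines. This is only a sketch: it assumes the 20 tests are independent of each other, which the post does not claim and which real lab panels only approximate.

```python
# Chance of "abnormal" flags in a healthy person, ASSUMING each test
# independently falls inside the normal range with probability 0.95.
p_normal = 0.95
n_tests = 20

expected_abnormal = n_tests * (1 - p_normal)   # average number of flagged results
p_at_least_one = 1 - p_normal ** n_tests       # chance of one or more flags

print(f"expected abnormal results per person: {expected_abnormal:.2f}")
print(f"chance of at least one abnormal result: {p_at_least_one:.1%}")
```

Under that independence assumption the average is one flagged result per healthy person, and roughly two out of three healthy people get at least one flag.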

This transfers over to CWOP QC only in the general sense, but there is something to be learned. If we expected to have perfect equipment, and if there was a way the QC computer could know exactly what the weather actually was in your location, then a bad QC would mean there is a problem. But clearly this is not reality.

QC is calculated from surrounding weather data. You might have an acceptable error in one direction, but if everyone around you randomly has an acceptable error in the other direction you will get flagged. But no one has a problem to fix or to fret over. And it is not surprising that humidity is a place where this occurs because it is the hardest of the weather parameters to measure. In medicine we call this a false positive. People die from false positives, when a false positive test value is followed up by an invasive test that causes a complication. That is why we train young doctors to be smart about how they look at data.
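A toy version of this situation, for illustration only. The function name `buddy_check`, the 3% sensor error and the 5% cutoff are all made up for the example; the real CWOP/MADIS analysis is more elaborate than a plain average of neighbors.

```python
from statistics import mean

def buddy_check(reading, neighbor_readings, tolerance):
    """Compare one reading with the mean of its neighbors.

    Returns (difference, flagged). Hypothetical helper, not the real QC code.
    """
    analysis = mean(neighbor_readings)
    difference = reading - analysis
    return difference, abs(difference) > tolerance

true_rh = 60.0        # actual relative humidity, %
sensor_error = 3.0    # assumed acceptable error for every sensor, % RH
qc_tolerance = 5.0    # assumed QC cutoff, % RH

my_reading = true_rh + sensor_error          # my sensor reads a little high
neighbors = [true_rh - sensor_error] * 4     # the neighbors all read a little low

difference, flagged = buddy_check(my_reading, neighbors, qc_tolerance)
print(f"difference from neighbors: {difference:+.1f}, flagged: {flagged}")
```

Every sensor here is within its acceptable error, yet the station is flagged because the errors happen to point in opposite directions.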

The medicine analogy only stretches so far. Now imagine the world was perfect and there were no false positives. The QC computer does know exactly what the weather is at your location, and compares it to your perfectly functioning weather station and always gets zero difference. We don't need CWOP then. The computer already has the answers and we are wasting our time.

We want your weather data precisely because it differs from what the QC predicts. We know the QC computer is not perfect. We know that every sensor, even professional quality sensors, will not be perfect. The more data we get from real world flawed sensors the easier it becomes to extract actual values. THAT is what CWOP is about.

Maybe you recall the late winter 2007 incident in Baltimore Harbor. A tourist pontoon boat flipped in a freak wind gust on a beautiful blue sky day, killing five. CWOP had fewer stations then, but we caught a signal of this event from four of them. It started about 30 minutes beforehand with a long slight increase in wind speed; the closer stations showed the peak getting higher and narrower, just like a wave cresting on a beach. We looked at the data to see if there might be something there that could have provided warning, but the signal was too weak. However, the signal would have tripped the QC, because this wind was not expected or seen by other stations. So this interesting event was not an instrumentation failure, even if the QC failed!

QC is there to look for systemic errors. The saturated humidity sensor that reads 20% high. The exposed temperature sensor that warms up 20 degrees between 2 and 4 pm when the sun hits it. The rainfall gauge that never records rain even though everyone around is in a monsoon. Those are the sorts of things that you need to be checking your QC for. But the occasional value that falls a little outside the arbitrarily set QC cutoff and then returns is not a problem. It is just the normal sort of thing that pops up when you gather and analyze huge amounts of data!
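One way to picture the difference between a systemic error and an occasional stray value is to look at how often a station fails over a stretch of days rather than at any single flag. The sketch below is hypothetical: `classify_flags` and its 50% cutoff are invented for illustration and are not anything CWOP publishes.

```python
def classify_flags(daily_flags, systemic_fraction=0.5):
    """Return 'ok', 'isolated' or 'systemic' for a list of daily QC results.

    daily_flags holds one boolean per day (True = that day failed QC).
    systemic_fraction is a made-up cutoff, not a CWOP setting.
    """
    if not daily_flags:
        return "ok"
    failed_days = sum(daily_flags)
    if failed_days == 0:
        return "ok"
    if failed_days / len(daily_flags) >= systemic_fraction:
        return "systemic"
    return "isolated"

# One bad day in two weeks versus a sensor that fails most days.
one_bad_day = [False] * 6 + [True] + [False] * 7
stuck_sensor = [True] * 12 + [False] * 2

print(classify_flags(one_bad_day))
print(classify_flags(stuck_sensor))
```

The single bad day comes out as "isolated" and can be ignored; the sensor that fails twelve days out of fourteen comes out as "systemic" and deserves a look.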

Steve K4HG

Offline George Richardson

  • WxElement panel
  • Forecaster
  • *****
  • Posts: 1391
    • Smith Mountain Lake Weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #1 on: March 06, 2014, 10:53:48 AM »
Heaven Help Us!  There is Some sanity in this world!  =D>

George

Offline Wtronics

  • Senior Member
  • **
  • Posts: 51
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #2 on: March 06, 2014, 02:40:04 PM »
This statement by Steve Dimse on QA/CWOP, and variations in the bell curve as well as outliers of the bell curve, is the best summation of statistical data analysis I have seen - especially as it applies to weather.

We all have differences with CWOP at times and we even have striking differences with neighboring stations at times. The trick is to know when we are statistically (consistently) out of line and when we are simply a bit off the bell curve in an isolated event.

As an example, Steve's statement is especially true with humidity. I have studied humidity sensors, and most manufacturers try to make a humidity sensor a linear device, and it is not. It is non-linear at both ends. So as a humidity sensor reading approaches 95% it goes into a (non-linear) curve. And at the extreme ends of most sensors there are even measurable variations from sensor to sensor. It is just too complex to write code to compensate for the last 5% of humidity (the equation would be at least 3rd order), so we just step outside, understand that the humidity is high, and move on.

The "short Dissertation On Quality control By Steve Dimse" should be the first consideration whenever we find data "out of line".

Most people here do consider data variations carefully and the real problem is when do we decide the sensor is producing erroneous data?

Thank you Steve!!!

Offline Old Tele man

  • Singing in the rain...
  • Forecaster
  • *****
  • Posts: 1035
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #3 on: December 13, 2014, 01:58:25 PM »
My CLIFF NOTES explanation is that NWS uses our CWOP data to *teach* their weather prediction software, so we (CWOP) are (literally) Guinea Pig exercises for their larger goal of unified weather prediction. So:

Garbage In, Garbage Out.
Mixed data In, mixed predictions Out
Skewed data In, skewed predictions Out
Good data In, eventually good data Out
 
SYS: Davis VP2/WL-IP & Envoy8X/WL-USB
CWOP: DW6988 - 2 miles NNE of Cortaro, AZ
WU - KAZTUCSO202, Countryside

Offline W3DRM

  • Forecaster
  • *****
  • Posts: 3209
    • Carson Valley Weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #4 on: December 13, 2014, 02:38:53 PM »
Steve, great explanation =D>

Statistics and analysis of weather data is great. You state "QC is there to look for systemic errors". However, one of the biggest issues I have (and have had for quite some time now) is the CWOP Barometer QC charts and the analysis they give. When I first started submitting to and receiving CWOP QC data, my barometric charts almost always showed the green two-thumbs up indicators. Sometime later, perhaps a year or so ago, there was a very apparent change in either the algorithms used for the QC analysis or the inclusion of other "questionable" barometric instrument inputs. The charts literally went from what would be considered normal barometer readings to charts with rapidly changing barometer readings. Those changes were, and still are, totally unrealistic due to the nature of how atmospheric pressure changes occur. Simply put, the charts were reporting huge changes that couldn't be true. The folks at CWOP, for whatever reason, have defended those changes and refuse to extract/reject the datum that is causing these fluctuations. The result has been that many of us have simply given up trying to keep our equipment calibrated and now totally ignore most of the CWOP QC reports we see. There are many posts elsewhere in this forum regarding this subject.

It is my not-so-humble opinion that presenting data that is obviously flawed is worse than having no data at all. Someone new installing their first weather station has no reliable means of calibrating their equipment via the CWOP system. CWOP used to be a great tool for all of us to use, but not any longer, as it cannot be trusted to provide valid data.

I get many emails from folks who view my weather website. Most of them ask why I don't fix my barometer. Thus, I am very close to simply removing any references to CWOP from my website.
Don - W3DRM - Minden, Nevada --- Blitzortung ID: 808 --- FlightRadar24 ID: F-KRNO2
Davis Wireless VP2, WD 10.37S-b53,
StartWatch, VirtualVP, VPLive, , Win10 Pro
--- Logitech HD Pro C920 webcam
--- RIPE Atlas Probe - 32849

Offline Old Tele man

  • Singing in the rain...
  • Forecaster
  • *****
  • Posts: 1035
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #5 on: December 13, 2014, 03:07:20 PM »
Mark Twain: "...the BEST government money can BUY...", so apparently NOAA is far down on the money-funding list (wink,wink).
« Last Edit: June 26, 2016, 02:47:55 PM by Old Tele man »
SYS: Davis VP2/WL-IP & Envoy8X/WL-USB
CWOP: DW6988 - 2 miles NNE of Cortaro, AZ
WU - KAZTUCSO202, Countryside

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 754
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #6 on: September 23, 2016, 09:02:42 PM »

This discussion is nearly two years old. Steve, anything change?
« Last Edit: November 07, 2016, 09:23:03 PM by WheatonRon »
Davis VP2 with Daytime FARS, SHT31 (2 complete systems-1 for uploading to the internet, the other system for test and play); Rainwise 111; CWOP--CW5020; WU--KILWHEAT17; CoCoRaHS--IL-DP-132

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 754
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #7 on: November 07, 2016, 08:44:46 PM »

I gather nothing has changed or Steve has not seen this post. Related to quality, there is a nearby PWS by me (not too far from Midway airport in Chicago) that is reporting hugely incorrect temperatures--over 100 degrees since November 1. Global warming is really here! Seriously, I have contacted the PWS owner and have tried to help him but he is slow to respond. His station is DW7813 in the CWOP network and on Mesowest  http://mesowest.utah.edu/cgi-bin/droman/meso_base.cgi?stn=d7813

Yes, both CWOP and Mesowest confirm his data is bad, but it still gets posted on the National Weather Service site. This is where common sense should prevail and CWOP or the NWS should pull his station (at least his temperature reading) from its website--makes a mockery out of all of us!


« Last Edit: November 08, 2016, 08:36:02 AM by WheatonRon »
Davis VP2 with Daytime FARS, SHT31 (2 complete systems-1 for uploading to the internet, the other system for test and play); Rainwise 111; CWOP--CW5020; WU--KILWHEAT17; CoCoRaHS--IL-DP-132

Online ValentineWeather

  • Forecaster
  • *****
  • Posts: 4099
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #8 on: November 07, 2016, 09:18:03 PM »
Humm! Only 128 F with a DP of 128...Heat index 476 F / 247 C.
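For the curious, the 476 F figure can be reproduced with the Rothfusz regression that the NWS publishes for heat index, assuming a dew point equal to the temperature means 100% relative humidity. Whether the site that displayed the number used exactly this formula is a guess; the function name below is just for this sketch.

```python
def rothfusz_heat_index_f(temp_f, rh_percent):
    """Heat index in deg F from the Rothfusz regression (no range adjustments).

    The regression was fitted for ordinary hot, humid weather; feeding it
    128 F at 100% RH is far outside that range and only shows how the
    formula extrapolates.
    """
    t, r = temp_f, rh_percent
    return (-42.379
            + 2.04901523 * t
            + 10.14333127 * r
            - 0.22475541 * t * r
            - 0.00683783 * t * t
            - 0.05481717 * r * r
            + 0.00122874 * t * t * r
            + 0.00085282 * t * r * r
            - 0.00000199 * t * t * r * r)

# Assumption: dew point == temperature, so relative humidity is 100%.
hi_f = rothfusz_heat_index_f(128.0, 100.0)
hi_c = (hi_f - 32.0) * 5.0 / 9.0
print(f"heat index: {hi_f:.0f} F / {hi_c:.0f} C")
```

The regression happily returns a number for any input, which is why a bad temperature sensor can turn into an absurd derived value further down the line.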
Randy, the Aviator is my father in 1963 with his Indian bike

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 754
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #9 on: November 07, 2016, 09:27:05 PM »
Wow!! That is warm and muggy but CWOP and the NWS are posting that data on their respective websites!  Great news!
« Last Edit: November 07, 2016, 11:22:16 PM by WheatonRon »
Davis VP2 with Daytime FARS, SHT31 (2 complete systems-1 for uploading to the internet, the other system for test and play); Rainwise 111; CWOP--CW5020; WU--KILWHEAT17; CoCoRaHS--IL-DP-132

Offline C5250

  • Forecaster
  • *****
  • Posts: 840
    • Local weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #10 on: November 07, 2016, 11:42:06 PM »
While this is a clear example of something being obviously wrong, would you want CWOP to stop reporting your data because it didn't pass QC?
Precious little in your life is yours by right and won without a fight.

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 754
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #11 on: November 07, 2016, 11:48:31 PM »
This is a difficult question to answer. When data is as badly wrong as the data coming from this PWS, it should not be published on any Internet site. I understand the difficulty in finding a line in the sand, but clearly this station has crossed it.
« Last Edit: November 08, 2016, 08:38:04 AM by WheatonRon »
Davis VP2 with Daytime FARS, SHT31 (2 complete systems-1 for uploading to the internet, the other system for test and play); Rainwise 111; CWOP--CW5020; WU--KILWHEAT17; CoCoRaHS--IL-DP-132

Online ValentineWeather

  • Forecaster
  • *****
  • Posts: 4099
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #12 on: November 08, 2016, 12:39:38 PM »
As long as the station really was bad (not just being flagged as bad because of someone else's data) and was chronically reporting bad data, I would want it removed from the data stream. Not a tough decision for me....

One thing I've run into is that QC isn't always reliable because, truthfully, they don't pay nearly as much attention as the weather enthusiast reporting the data does.
Randy, the Aviator is my father in 1963 with his Indian bike

Offline Old Tele man

  • Singing in the rain...
  • Forecaster
  • *****
  • Posts: 1035
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #13 on: November 08, 2016, 12:46:15 PM »
"Personal Pride" is a wonderful thing, ain't it!
« Last Edit: October 12, 2017, 04:23:04 PM by Old Tele man »
SYS: Davis VP2/WL-IP & Envoy8X/WL-USB
CWOP: DW6988 - 2 miles NNE of Cortaro, AZ
WU - KAZTUCSO202, Countryside

Offline azchrisf

  • Senior Member
  • **
  • Posts: 67
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #14 on: October 12, 2017, 02:50:39 PM »
I think another thing CWOP and MADIS could do is offer the ability to specify conditions unique to a site. That would help the algorithms adjust in the long run IMHO, because it would not lump everything into an "all or nothing" type analysis.

 
