Author Topic: A short Dissertation On Quality control By Steve Dimse  (Read 15825 times)

Offline Beaudog

  • Forecaster
  • *****
  • Posts: 1217
A short Dissertation On Quality control By Steve Dimse
« on: March 06, 2014, 10:20:18 AM »

 Just for the record, I had this problem a few days ago as well. QC flagged my dewpoint, even though my station was fine. One day of errors, and then everything went back to normal.

I'd like to try again to convince people that chasing QA is not worthwhile. Most of you probably don't know my life history, but my first career was as a professor of emergency medicine for the University of Miami. All profs have an area where they spend a little extra time to become the local expert on something. With my interest in computers and math I naturally became the UM SoM expert in quantitative analysis and critical analysis. The skill doesn't precisely carry over to meteorology, but the principles do.

The first thing I want to share is how we look at lab tests. If you test some parameter in a group of normal humans, say sodium, you will get a very wide range of values even if your equipment works perfectly. So we rather arbitrarily assume that the values will follow a bell shaped distribution (they don't) and draw the line for normal such that 95% of the healthy population fits in there. This is really important, because it means that fully 5% of healthy people have abnormal results for this and any other test. Furthermore, if you run a panel of 20 lab tests like we love to do nowadays, on average every healthy person will have one abnormal value. This is a big reason why docs don't want you to have your lab results - it takes us forever to convince people that lab values marked abnormal by the lab are actually normal.
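
To put rough numbers on the 20-test panel, here is a minimal sketch. It assumes the tests are statistically independent and that each "normal" range covers exactly 95% of healthy people; real lab panels are correlated, so treat this as an illustration only. The function names are made up for the example.

```python
# Illustration only: false "abnormal" flags in a panel of lab tests,
# assuming independent tests whose normal range covers 95% of healthy people.

def expected_flags(n_tests, coverage=0.95):
    """Average number of out-of-range results for one healthy person."""
    return n_tests * (1.0 - coverage)

def prob_at_least_one_flag(n_tests, coverage=0.95):
    """Chance that a healthy person has one or more out-of-range results."""
    return 1.0 - coverage ** n_tests

for n in (1, 5, 20):
    print(n, round(expected_flags(n), 2), round(prob_at_least_one_flag(n), 3))
```

Under those assumptions a 20-test panel averages one flag per healthy person, and close to two out of three healthy people get at least one.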

This transfers over to CWOP QC only in the general sense, but there is something to be learned. If we could expect perfect equipment, and if there were a way the QC computer could know exactly what the weather actually was in your location, then a bad QC would mean there is a problem. But clearly this is not reality.

QC is calculated from surrounding weather data. You might have an acceptable error in one direction, but if everyone around you randomly has an acceptable error in the other direction you will get flagged. But no one has a problem to fix or to fret over. And it is not surprising that humidity is a place where this occurs because it is the hardest of the weather parameters to measure. In medicine we call this a false positive. People die from false positives, when a false positive test value is followed up by an invasive test that causes a complication. That is why we train young doctors to be smart about how they look at data.
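
The "errors in opposite directions" case can be sketched in a few lines. This is not the actual CWOP/MADIS algorithm; the median comparison, the 3-point tolerance, and all the readings below are assumptions chosen only to show the effect.

```python
# Sketch of a neighbor-based check (NOT the real CWOP/MADIS method).
# A station is flagged when it differs from the median of its neighbors
# by more than a tolerance. All numbers here are hypothetical.
from statistics import median

def flagged_against_neighbors(station_value, neighbor_values, tolerance):
    """Return True if the station is too far from the neighbor median."""
    return abs(station_value - median(neighbor_values)) > tolerance

true_rh = 50.0                    # what the humidity "really" is (unknown in practice)
station = 52.0                    # +2: an acceptable sensor error
neighbors_low = [48.0, 47.5, 48.5]   # all roughly -2: also acceptable errors
neighbors_ok = [50.5, 49.5, 50.0]

print(flagged_against_neighbors(station, neighbors_low, tolerance=3.0))  # flagged
print(flagged_against_neighbors(station, neighbors_ok, tolerance=3.0))   # not flagged
```

In the first case every sensor is within 3 points of the true value, yet the station is flagged because the comparison is against the neighbors, not against the truth.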

The medicine analogy only stretches so far. Now imagine the world was perfect and there were no false positives. The QC computer does know exactly what the weather is at your location, and compares it to your perfectly functioning weather station and always gets zero difference. We don't need CWOP then. The computer already has the answers and we are wasting our time.

We want your weather data precisely because it differs from what the QC predicts. We know the QC computer is not perfect. We know that every sensor, even professional quality sensors, will not be perfect. The more data we get from real world flawed sensors the easier it becomes to extract actual values. THAT is what CWOP is about.

Maybe you recall the late winter 2007 incident in Baltimore Harbor. A tourist pontoon boat flipped in a freak wind gust on a beautiful blue sky day, killing five. CWOP had fewer stations then, but we caught a signal of this event from four of them. It started about 30 minutes beforehand with a long, slight increase in wind speed; the closer stations showed the peak getting higher and narrower, just like a wave cresting on a beach. We looked at the data to see if there might be something there that could have provided warning, but the signal was too weak. However, the signal would have tripped the QC, because this wind was not expected or seen by other stations. So this interesting event was not an instrumentation failure, even if the QC failed!

QC is there to look for systemic errors. The saturated humidity sensor that reads 20% high. The exposed temperature sensor that warms up 20 degrees between 2 and 4 pm when the sun hits it. The rainfall gauge that never records rain even though everyone around is in a monsoon. Those are the sorts of things that you need to be checking your QC for. But the occasional value that falls a little outside the arbitrarily set QC cutoff and then returns is not a problem. It is just the normal sort of thing that pops up when you gather and analyze huge amounts of data!
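
One way to separate the two cases when looking at your own QC history is to ask how long the flags persist. The sketch below is a hypothetical rule of thumb, not anything CWOP publishes: the three-day threshold and the function names are assumptions.

```python
# Hypothetical rule of thumb: a QC flag that persists for several days in a
# row is worth investigating; a one-off flag that clears is probably noise.

def longest_flag_run(daily_flags):
    """Length of the longest streak of consecutive flagged days."""
    longest = current = 0
    for flagged in daily_flags:
        current = current + 1 if flagged else 0
        longest = max(longest, current)
    return longest

def classify(daily_flags, min_run=3):
    """Label a QC history as 'clean', 'isolated', or 'persistent'."""
    run = longest_flag_run(daily_flags)
    if run == 0:
        return "clean"
    return "persistent" if run >= min_run else "isolated"

print(classify([False, True, False, False, False]))   # one bad day
print(classify([False, True, True, True, True]))      # stuck sensor?
```

The one-day dewpoint flag described at the top of this post would come out as "isolated"; a humidity sensor stuck 20% high would come out as "persistent".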

Steve K4HG

Offline George Richardson

  • WxElement panel
  • Forecaster
  • *****
  • Posts: 1391
    • Smith Mountain Lake Weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #1 on: March 06, 2014, 10:53:48 AM »
Heaven Help Us!  There is Some sanity in this world!  =D>

George

Offline Yfory

  • Senior Member
  • **
  • Posts: 65
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #2 on: March 06, 2014, 02:40:04 PM »
This statement by Steve Dimse on QA/CWOP, and variations in the bell curve as well as outliers of the bell curve, is the best summation of statistical data analysis I have seen - especially as it applies to weather.

We all have differences with CWOP at times and we even have striking differences with neighboring stations at times. The trick is to know when we are statistically (consistently) out of line and when we are simply a bit off the bell curve in an isolated event.

As an example, Steve's statement is especially true with humidity. I have studied humidity sensors, and most manufacturers try to make a humidity sensor a linear device, and it is not. It is non-linear at both ends. So as a humidity sensor reading approaches 95% it goes into a (non-linear) curve. And at the extreme ends of most sensors there are even measurable variations from sensor to sensor. It is just too complex to write code to compensate for the last 5% of humidity (the equation would be at least 3rd order), so we just step outside, understand that the humidity is high, and move on.

The "short Dissertation On Quality control By Steve Dimse" should be the first consideration whenever we find data "out of line".

Most people here do consider data variations carefully and the real problem is when do we decide the sensor is producing erroneous data?

Thank you Steve!!!

Offline Old Tele man

  • Singing in the rain...
  • Forecaster
  • *****
  • Posts: 1365
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #3 on: December 13, 2014, 01:58:25 PM »
My CLIFF NOTES explanation is that NWS uses our CWOP data to *teach* their weather prediction software, so we (CWOP) are (literally) Guinea Pig exercises for their larger goal of unified weather prediction. So:

• Garbage In, Garbage Out.
• Mixed data In, mixed predictions Out
• Skewed data In, skewed predictions Out
• Good data In, eventually good data Out
 
• SYS: Davis VP2 Vue/WL-IP & Envoy8X/WL-USB;
• DBX2 & DBX1 Precision Digital Barographs
• CWOP: DW6988 - 2 miles NNE of Cortaro, AZ
• WU - KAZTUCSO202, Countryside

Offline W3DRM

  • Forecaster
  • *****
  • Posts: 3360
    • Emmett Weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #4 on: December 13, 2014, 02:38:53 PM »
Steve, great explanation =D>

Statistics and analysis of weather data are great. You state "QC is there to look for systemic errors". However, one of the biggest issues I have (and have had for quite some time now) is the CWOP Barometer QC charts and the analysis they give. When I first started submitting to and receiving CWOP QC data, my barometric charts almost always showed the green two-thumbs up indicators. Sometime later, perhaps a year or so ago, there was a very apparent change in either the algorithms used for the QC analysis or the inclusion of other "questionable" barometric instrument inputs. The charts literally went from what would be considered normal barometer readings to charts with rapidly changing barometer readings. Those changes were, and still are, totally unrealistic given how atmospheric pressure changes occur. Simply put, the charts were reporting huge changes that couldn't be true. The folks at CWOP, for whatever reason, have defended those changes and refuse to extract/reject the datum that is causing these fluctuations. The result has been that many of us have simply given up trying to keep our equipment calibrated and now totally ignore most of the CWOP QC reports we see. There are many posts elsewhere in this forum regarding this subject.

In my not-so-humble opinion, presenting data that is obviously flawed is worse than having no data at all. Someone new installing their first weather station has no reliable means of calibrating their equipment via the CWOP system. CWOP used to be a great tool for all of us to use, but not any longer, as it cannot be trusted to provide valid data.

I get many emails from folks who view my weather website. Most of them ask why I don't fix my barometer. Thus, I am very close to simply removing any references to CWOP from my website.
Don - W3DRM - Emmett, Idaho --- Blitzortung ID: 808 --- FlightRadar24 ID: F-KBOI7
Davis Wireless VP2, WD 10.37s150,
StartWatch, VirtualVP, VPLive, Win10 Pro
--- Logitech HD Pro C920 webcam (off-line)
--- RIPE Atlas Probe - 32849

Offline Old Tele man

  • Singing in the rain...
  • Forecaster
  • *****
  • Posts: 1365
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #5 on: December 13, 2014, 03:07:20 PM »
Mark Twain: "...the BEST government money can BUY...", so apparently NOAA is far down on the money-funding list (wink,wink).
« Last Edit: June 26, 2016, 02:47:55 PM by Old Tele man »

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 1237
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #6 on: September 23, 2016, 09:02:42 PM »

This discussion is nearly two years old. Steve, anything change?
« Last Edit: November 07, 2016, 09:23:03 PM by WheatonRon »
Davis VP2 with SHT31 (3 complete VP2 systems—2 with a daytime fan and 1 that has a 24 hour fan); CWOP--CW5020, FW3075 and FW4350; WU--KILWHEAT17, KILWHEAT36 and KILWHEAT39; WeatherCloud.net; CoCoRaHS--IL-DP-132; and Weatherlink 2.0

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 1237
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #7 on: November 07, 2016, 08:44:46 PM »

 Just for the record, I had this problem a few days ago as well. QC flagged my dewpoint, even though my station was fine. One day of errors, and then everything went back to normal.

...

QC is there to look for systemic errors. The saturated humidity sensor that reads 20% high. The exposed temperature sensor that warms up 20 degrees between 2 and 4 pm when the sun hits it. The rainfall gauge that never records rain even though everyone around is in a monsoon. Those are the sorts of things that you need to be checking your QC for. But the occasional value that falls a little outside the arbitrarily set QC cutoff and then returns is not a problem. It is just the normal sort of thing that pops up when you gather and analyze huge amounts of data!

Steve K4HG

This discussion is nearly two years old. Steve, anything change?

I gather nothing has changed or Steve has not seen this post. Related to quality, there is a nearby PWS by me (not too far from Midway airport in Chicago) that is reporting hugely incorrect temperatures--over 100 degrees since November 1. Global warming is really here! Seriously, I have contacted the PWS owner and have tried to help him but he is slow to respond. His station is DW7813 in the CWOP network and on Mesowest  http://mesowest.utah.edu/cgi-bin/droman/meso_base.cgi?stn=d7813

Yes, both CWOP and Mesowest confirm his data is bad, but it still gets posted on the National Weather Service site. This is where common sense should prevail and CWOP or the NWS should pull his station (at least his temperature reading) from its website--makes a mockery out of all of us!


« Last Edit: November 08, 2016, 08:36:02 AM by WheatonRon »

Offline ValentineWeather

  • Forecaster
  • *****
  • Posts: 6364
    • Valentine Nebraska's Real-Time Weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #8 on: November 07, 2016, 09:18:03 PM »
Humm! Only 128° F with a DP of 128°...Heat index 476 F / 247 C.
Randy
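
For the curious, the 476 F figure appears consistent with plugging the bad readings straight into the NWS (Rothfusz) heat index regression. The sketch below assumes that is the formula being used and that a dewpoint equal to the temperature means 100% relative humidity; the regression was never meant for inputs this far outside normal conditions, which is why the result is absurd.

```python
# Rothfusz heat index regression (temperature in F, relative humidity in %).
# Used here far outside its intended range, purely to show where a number
# like "476 F" can come from when a station reports garbage.

def heat_index_f(temp_f, rh_pct):
    t, r = temp_f, rh_pct
    return (-42.379 + 2.04901523 * t + 10.14333127 * r
            - 0.22475541 * t * r - 6.83783e-3 * t * t
            - 5.481717e-2 * r * r + 1.22874e-3 * t * t * r
            + 8.5282e-4 * t * r * r - 1.99e-6 * t * t * r * r)

def f_to_c(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0

hi_f = heat_index_f(128.0, 100.0)   # dewpoint == temperature -> RH taken as 100%
print(round(hi_f), "F /", round(f_to_c(hi_f)), "C")
```

So the displayed heat index is most likely just the formula faithfully extrapolating from a broken temperature sensor.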

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 1237
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #9 on: November 07, 2016, 09:27:05 PM »
Humm! Only 128° F with a DP of 128°...Heat index 476 F / 247 C.

Wow!! That is warm and muggy but CWOP and the NWS are posting that data on their respective websites!  Great news!
« Last Edit: November 07, 2016, 11:22:16 PM by WheatonRon »

Offline C5250

  • Forecaster
  • *****
  • Posts: 840
    • Local weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #10 on: November 07, 2016, 11:42:06 PM »
Humm! Only 128° F with a DP of 128°...Heat index 476 F / 247 C.

Wow!! That is warm and muggy but CWOP and the NWS are posting that data on their respective websites!  Great news!

While this is a clear example of something being obviously wrong, would you want CWOP to stop reporting your data because it didn't pass QC?
Precious little in your life is yours by right and won without a fight.

Offline WheatonRon

  • Forecaster
  • *****
  • Posts: 1237
    • WUnderground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #11 on: November 07, 2016, 11:48:31 PM »
Humm! Only 128° F with a DP of 128°...Heat index 476 F / 247 C.

Wow!! That is warm and muggy but CWOP and the NWS are posting that data on their respective websites!  Great news!

While this is a clear example of something being obviously wrong, would you want CWOP to stop reporting your data because it didn't pass QC?

This is a difficult question to answer. When data is as badly wrong as the data coming from this PWS, it should not be published on any Internet site. I understand the difficulty in finding a line in the sand, but clearly this station has crossed it.
« Last Edit: November 08, 2016, 08:38:04 AM by WheatonRon »

Offline ValentineWeather

  • Forecaster
  • *****
  • Posts: 6364
    • Valentine Nebraska's Real-Time Weather
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #12 on: November 08, 2016, 12:39:38 PM »
As long as the station really is bad (not just flagged as bad because of someone else) and is chronically reporting bad data, I would want it removed from the data stream. Not a tough decision for me....

The thing I've run into is that QC isn't always reliable because, truthfully, they don't pay as much attention as the weather enthusiast reporting the data does.
Randy

Offline Old Tele man

  • Singing in the rain...
  • Forecaster
  • *****
  • Posts: 1365
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #13 on: November 08, 2016, 12:46:15 PM »
As long as the station really is bad (not just flagged as bad because of someone else) and is chronically reporting bad data, I would want it removed from the data stream. Not a tough decision for me....

The thing I've run into is that QC isn't always reliable because, truthfully, they don't pay as much attention as the weather enthusiast reporting the data does.
"Personal Pride" is a wonderful thing, ain't it!
« Last Edit: October 12, 2017, 04:23:04 PM by Old Tele man »

Offline azchrisf

  • Cobra Weather Dominator Operator
  • Forecaster
  • *****
  • Posts: 455
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #14 on: October 12, 2017, 02:50:39 PM »
I think another thing CWOP and MADIS could do is offer the ability to specify conditions unique to a site. That would help the algorithms adjust in the long run IMHO, because it's not lumping everything into an "all or nothing" type analysis.
Davis Vantage Pro 2 Plus 6163 w/ 8 Transmitters!
Also doing Soil and Leaf 4x
WU: KAZGLEND106 CWOP: FW1398 (F1398) Purpleair: 98793/LAZGLEND8
My setup:
https://www.wxforum.net/index.php?topic=41867.0

Offline biker57

  • Member
  • *
  • Posts: 4
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #15 on: June 09, 2020, 03:42:20 PM »
My station is now active with CWOP and also set up with MADIS, but in the process my hardware info is not showing and is throwing a QC error.  I emailed Philip Gladstone at the email address at the bottom of the page as instructed to have it corrected, but so far no replies after a week. Does anyone have a suggestion, or somewhere else I should send this info?

Offline galfert

  • Global Moderator
  • Forecaster
  • *****
  • Posts: 6822
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #16 on: June 09, 2020, 05:28:45 PM »
My station is now active with CWOP and also set up with MADIS, but in the process my hardware info is not showing and is throwing a QC error.  I emailed Philip Gladstone at the email address at the bottom of the page as instructed to have it corrected, but so far no replies after a week. Does anyone have a suggestion, or somewhere else I should send this info?

You are not uploading directly to CWOP. You are using AmbientCWOP service as a middle man. This middle man service is new and is not yet recognized by Gladstonefamily.net as to what type of hardware is providing this data. Hence it just gets listed in Red and it just says AmbientCWOP.com. But don't expect that to change...see below why.

You have been uploading for barely a week since your final registration was processed. It takes time for MADIS QC analysis data to actually start to be processed for a new station. In other words, you are too new and there is no Gladstonefamily.net analysis for your station yet. Your current analysis error is that there is no analysis data...yet.
You aren't even on the map yet:
https://madis-data.ncep.noaa.gov/MadisSurface/

Your QC analysis won't always be a week behind; it just takes a few days to a bit over a week for a new station to be fully acquired and synced across all databases. Gladstonefamily.net usually does this once a week on Wednesdays. You just missed last week's cutoff because your final registration was processed last Tuesday and it takes a few days after that to be fully official, so you need to wait until Wednesday of this week to finally see analysis data. Oh, and when you do start to see analysis data, ignore the wind analysis, as that does not work.
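
The timing above can be worked through as a small sketch. The registration date (Tuesday, June 2, 2020) and the three-day settling period are assumptions read off this post ("last Tuesday", "a few days"), and the function name is made up; the weekly Wednesday run is as described above.

```python
# Sketch: when would a new station first appear in a weekly Wednesday run?
# The settle_days value and the registration date are assumptions.
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday() counts Monday as 0

def first_weekly_run(registered, settle_days=3, run_weekday=WEDNESDAY):
    """First run day on or after the date the registration has settled."""
    ready = registered + timedelta(days=settle_days)
    days_ahead = (run_weekday - ready.weekday()) % 7
    return ready + timedelta(days=days_ahead)

registered = date(2020, 6, 2)        # assumed: the Tuesday before this post
print(first_weekly_run(registered))  # Wednesday of the following week
```

With those assumptions the Wednesday right after registration is missed, and the first analysis shows up on the Wednesday after that.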

And forget about trying to reach Philip via email. Gladstonefamily.net is like a ghost ship without a captain. It is what it is, very dated and seemingly abandoned, so don't expect anything to change. There is nobody else to reach. With Gladstonefamily.net you take it or leave it. Many people just choose to ignore it and instead just use Mesowest and MADIS surface map for QC.

The only reason I was able to provide any of this detailed information is because I did some sleuthing to find out what station you are. It is a bit of work to connect all the dots and find out who you are (several steps are required). I doubt most other users would even bother or even know how. Next time you need help, please provide your station ID so we can dig in and easily see what is occurring, and you'll likely get others to chip in and provide their expertise. What you typically see when people ask for help without providing their station ID is that someone may ask you to provide it. Then you respond and then they look. Days can pass with all that back and forth. It is more efficient to just start off by saying who you are. I understand that you just wanted to know whom else to contact, so you weren't really thinking that this forum was going to be the place to actually get the information you really needed.
« Last Edit: June 09, 2020, 05:54:44 PM by galfert »
Ecowitt GW1000 | Meteobridge on Raspberry Pi
WU: KFLWINTE111  |  PWSweather: KFLWINTE111
CWOP: FW3708  |  AWEKAS: 14814
Windy: pws-f075acbe
Weather Underground Issue Tracking
Tele-Pole

Offline funsutton

  • Contributor
  • ***
  • Posts: 140
    • Weather Underground
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #17 on: June 09, 2020, 06:37:37 PM »
Quote
The only reason I was able to provide any of this detailed information is because I did some sleuthing to find out what station you are.

White Hat sleuther no doubt!

Offline biker57

  • Member
  • *
  • Posts: 4
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #18 on: June 09, 2020, 08:51:22 PM »
Thanks for the detailed info I really appreciate it.

My station is FW7529(F7529)

Offline jkline

  • Member
  • *
  • Posts: 15
    • PaloAltoWeather.com
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #19 on: June 14, 2020, 01:38:16 PM »
...and instead just use Mesowest and MADIS surface map for QC.
Hi Galfert,

I’m aware of Mesowest QC; but I have no idea about using MADIS surface map for QC.  I would appreciate it if you would fill me in on how to do this?

Cheers,
John Kline
F4751
https://www.paloaltoweather.com/

Offline galfert

  • Global Moderator
  • Forecaster
  • *****
  • Posts: 6822
Re: A short Dissertation On Quality control By Steve Dimse
« Reply #20 on: June 14, 2020, 02:51:06 PM »
I’m aware of Mesowest QC; but I have no idea about using MADIS surface map for QC.  I would appreciate it if you would fill me in on how to do this?

https://www.wxforum.net/index.php?topic=35001.0