Author Topic: Some more info for your AcuRite driver  (Read 10053 times)

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Some more info for your AcuRite driver
« on: August 28, 2015, 09:15:52 PM »
I've noticed that wind speed decoding is not quite right for the acurite.py driver and I think I've just figured out the proper decoding. See the post below for the details:

http://www.wxforum.net/index.php?topic=27244.0

Just my way of saying thanks for helping me with my own decoding on AcuRite through all the work that folks have put into the acurite.py driver. Hope this helps!

Cheers,
            aweatherguy

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #1 on: August 28, 2015, 09:31:03 PM »
P.S. If you change the return statement in decode_windspeed function to this I think you'll be good to go:

return 0.8278 * (a | b) + 1.00

Offline mwall

  • Contributor
  • ***
  • Posts: 135
Re: Some more info for your AcuRite driver
« Reply #2 on: August 28, 2015, 10:18:26 PM »
thanks for that!  the implementation we were using seemed to be off a bit, but i never would have guessed the actual function.

the change will be in the next weewx release, which will probably be 3.2.2.

m

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #3 on: August 29, 2015, 02:18:35 AM »
No problem. Don't forget about treating zero as a special case... I probably should have posted this snippet instead. I don't know if I have the Python syntax correct, but the idea's there.

if (a | b) == 0:
   return 0.0
else:
   return 0.8278 * (a | b) + 1.00
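
For reference, the zero check and the linear fit can be folded into one small helper. This is only a sketch, not the actual acurite.py code: the function name is hypothetical, and it assumes the two bit fields a and b have already been extracted and shifted the way the driver's decode_windspeed does it. Units are whatever that function already returns.

```python
def decode_windspeed_raw(a, b):
    """Convert the raw wind-speed bit fields to a speed.

    a, b: the two already-shifted bit fields (an assumption here;
    extracting them from the report is left to the driver).
    """
    n = a | b
    if n == 0:
        # A raw count of zero means no wind, not 1.00.
        return 0.0
    return 0.8278 * n + 1.00
```

Without the zero check, a calm reading would come out as 1.00 instead of 0.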

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #4 on: September 02, 2015, 10:56:31 PM »
Another update on AcuRite console decoding.

First I might be on to something in terms of being able to work around all the garbage that the 02032C console generates if you enable PC connect mode 3. I'll post more on this if it pans out (odds are probably around 50/50 at this point, but I'm certainly not counting on it).



Secondly, I've figured out two of the records in input report 3 -- those with IDs 3 and 5. They are both time stamps, and I think that was already known. But what might be useful is knowing that ID 3 is the last time a history record was stored, and ID 5 is when the history query was received. So far, ID 5 always seems to be later than ID 3 but never by more than 12 minutes (the history update interval).

So, why is this useful? Well, if for instance ID 3 says the last time history was updated was 9/2/2015 10:01 and ID 5 says the history request time is 9/2/2015 10:08 it means that you could have requested history 7 minutes earlier and still gotten a current update.

In summary, you can use the difference between the time stamps on record IDs 3 and 5 to fine-tune your timing when querying history updates. This way you can be sure to get history updates that are, say, less than 1 minute old instead of taking your chances and maybe winding up with an update that's 11 minutes old.

So while this may or may not end up working on 02032C consoles, if the other consoles (w/o the firmware problem) still emit record IDs 3 and 5 it would be useful in dealing with them.
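
The timing idea above could be sketched roughly like this in Python. The function name and the one-minute margin are assumptions made for illustration; only the 12-minute interval and the meaning of IDs 3 and 5 come from the observations above.

```python
from datetime import datetime, timedelta

HISTORY_INTERVAL = timedelta(minutes=12)  # history update interval


def wait_before_next_query(stored, requested, margin=timedelta(minutes=1)):
    """How long to wait after a history request before asking again.

    stored:    time stamp from record ID 3 (last history record stored)
    requested: time stamp from record ID 5 (history request received)
    margin:    hypothetical safety delay after the console stores a record
    """
    age = requested - stored  # how old the newest record was when we asked
    if age < timedelta(0) or age > HISTORY_INTERVAL:
        raise ValueError("inconsistent time stamps, age=%s" % age)
    # The next record should be stored one interval after `stored`,
    # which is (interval - age) after our request.
    return HISTORY_INTERVAL - age + margin
```

With the example above (ID 3 at 10:01, ID 5 at 10:08) the record was 7 minutes old, so the sketch suggests waiting 6 minutes. Both time stamps come from the console's clock, so only their difference is used and the PC clock does not need to agree with the console.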

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #5 on: September 05, 2015, 03:28:44 AM »
Just another update. I'm still learning more about the protocol here.

In the weewx driver there is separate treatment for barometer calculations depending on whether the console is 02032C or not. I think this distinction can be removed. Here's why:

1) By examining the new app from AcuRite, it appears that the pressure from 02032C consoles is in 1/16th mb increments with a 208mb offset... so pressure in mb = value / 16 - 208.

2) However, if you just treat the R2 record as if it were from a different console and perform the calcs for an HPS03 pressure sensor, you get exactly the same answer as from (1) above.

Bottom line is I think you can ignore the difference in consoles and just perform the same calcs for pressure on all consoles. Of course, you should check this for yourself before making the changes.
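
As a quick worked example of formula (1), here is a minimal Python sketch. The function names are made up for illustration, and where the raw value sits in the R2 record is not covered here.

```python
def pressure_mb(raw):
    """Pressure in mb from the raw value: 1/16 mb steps, 208 mb offset."""
    return raw / 16.0 - 208.0


def raw_for_pressure(mb):
    """Inverse of pressure_mb, handy for building test values."""
    return int(round((mb + 208.0) * 16))
```

A raw value of 19536 works out to 19536 / 16 - 208 = 1013.0 mb.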

Here is a little further support for this idea: I've noticed that there is absolutely zero difference in the USB descriptors for two different types of consoles (e.g. 2032C versus 1035). USB Vendor ID/Product IDs are identical as is the entire descriptor contents. I think if AcuRite were going to change data interpretation between these consoles they would have changed something in the HID device descriptors or product ID or something so that their own software could distinguish.

What they seem to have done with the new 2032C console is to massage the pressure readings from the new barometer to be compatible with the data format in prior consoles.

P.S. I think my attempts to write a driver that can "survive" operation with a 2032C console in PC Connect mode 3 might actually be paying off. It may or may not be possible to dump large amounts of history data, but once the history buffer is cleared out I think that streaming real-time data from the console (including indoor temp/humidity every 12 minutes) might be doable. There is still a lot of testing to do, but this is looking at least possible now.

Offline mwall

  • Contributor
  • ***
  • Posts: 135
Re: Some more info for your AcuRite driver
« Reply #6 on: September 05, 2015, 09:24:59 AM »
Quote from: aweatherguy
1) By examining the new app from AcuRite, it appears that the pressure from 02032C consoles is in 1/16th mb increments with a 208mb offset... so pressure in mb = value / 16 - 208.

2) However, if you just treat the R2 record as if it were from a different console and perform the calcs for an HPS03 pressure sensor, you get the same exact answer from using (1) above.

that sounds like the 02032C is just doing a linear approximation of what the HPS03 sensor is doing.

i am inclined to keep the distinction for now, for two reasons:

1) it is good to know whether we are dealing with a 02032C or some other hardware (01035, 01036, etc.)

2) the hardware that uses the HP03 sensor sometimes reports wacky constants, but the reported pressures still seem to be ok.  detecting this will help us get a better idea of other hardware variations and behaviors.

Quote from: aweatherguy
P.S. I think my attempts to write a driver that can "survive" operation with a 2032C console in PC Connect mode 3 might actually be paying off. It may or may not be possible to dump large amounts of history data, but once the history buffer is cleared out I think that streaming real-time data from the console (including indoor temp/humidity every 12 minutes) might be doable. There is still a lot of testing to do, but this is looking at least possible now.

there are two ways in which weewx might use historical data from the acurite stations:

1) catchup.  when weewx starts up, it reads any historical data from the station since the latest timestamp in the weewx database.

2) use the station's archiving instead of weewx software archive record generation.

there is a third reason we might want to read historical records: (3) to get indoor humidity.

doing (1) has marginal value because the station has very little memory and it defaults to mode 2.

doing (2) has even less value because it happens every 12 minutes, and keeping the data logger turned on (mode 3) causes usb communication issues not only on the 02032 but also on the 01035/01036 (just not as often or repeatably).

so (3) is the primary reason i would consider reading historical records.

whether we end up doing (1) and/or (3), we basically need two things:

a) how to read records with surgical precision so that we do not get stuck in firmware usb timing problems

b) how to decode the historical records

m

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #7 on: September 05, 2015, 04:42:31 PM »
Matt, thanks for the discussion. You bring up good points.

I agree that it's good to know if you have the 2032C console or not. That said I suggest you modify the pressure formula as mentioned (value / 16 - 208mb) and it will match exactly what the new acurite software is decoding.

One thing to be aware of going forward is that some day acurite may release a new console which reports the same constants in R2 but does not have the firmware problems of the 2032C console. Today however, this test works.



Regarding the reliability here's a bit more info. I've done several tests now with a 1035 console and it NEVER spits out any corrupt data in R3. I'm taking a leap here but I suspect that the 2032C console is the only one that spits corrupt data.

I have come up with an algorithm that seems to tolerate the corrupt data in R3 w/o locking up communications with the console. I won't try to document it here, but it involves identifying corrupt input reports from the console and sending repeated requests until good data is received. The trick seems to be in how all of this is timed, especially when a request needs to be repeated. I'm at the point now of running some very long (perhaps as long as several days) streaming sessions with the 2032C console to see if anything locks up.

There is also one other test I need to perform -- getting the data logger either full or close to it, then make sure I can dump the history data and return to a streaming algorithm where you only get one new history record every 12 minutes. I have done this now (dumping history) with up to 58 history records and it sometimes takes a couple of repeated attempts but eventually works. I do know that it will probably not be possible to successfully download more than 40-50 history records w/o corruption. I just need to discover if it is possible to dump a full history w/o user intervention.

If this (dumping of large amounts of history data) does not work then streaming would still be possible but folks might have to manually reset the console if it had logged more than a certain amount of data. When the console sends out the ID5 time stamp portion of history, that seems to be the point where it resets its internal pointers so you don't get the big dump next time.

None of this is a problem with the 1035 console -- it does not emit corrupt data (not that I've seen yet, anyway).

Within the R3 data, comparing the ID3 and ID5 time stamps does indeed seem to work to fine tune timing of R3 queries and you can get it so that you are querying R3 no more than one minute after it has updated.



Finally, I have worked out most of the data in the history segment of R3 (that's the ID4 sub-record). There are just a couple of bytes I'm not sure of but have successfully decoded the following in both 1035 and 2032C consoles:

indoor temp/rh
outdoor temp/rh/dp/wind chill/heat index
barometer
current wind dir and speed
peak wind and avg wind speeds
rain event date/time and rain event amounts.

So, as far as getting indoor temp/rh that's done.




Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #8 on: September 05, 2015, 08:35:26 PM »
Here's the decoding info for history (input report 3). Much of this was already either known or guessed at. I've confirmed the location of all of the fields on both 2032C and 1035 consoles. As 1035 consoles don't spit corrupt data it should be low risk to add this for all but 2032C consoles.

Input Report 3 is broken into sub-records by the "0xaa, 0x55" separator sequence. There is a separator sequence prior to the first sub-record, but none after the last sub-record. Each sub-record contains a six byte header:

Sub-record Headers

Byte(s)   Contents
=======================================
0      Sub-record ID (1-6 are valid)
1,2    Unknown, always seem to be zero
3,4    Size of the sub-record in "chunks"
5      Checksum -- this is the total of bytes 0..4 minus one

A note on the size field: This seems to indicate how many "chunks" follow, but the size of the "chunk" is assumed and depends on the sub-record ID:

ID(s)     Chunk size
=======================
1,2,3,5    8 bytes
4         32 bytes
6         meaningless -- ID6 never contains any data but a size of "4" is always declared.


For all but sub-ID 6, the total sub-record size should be equal to (6 + chunk_size x size)
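
Here is a rough Python sketch of the header rules above, for anyone who wants to experiment. Two things are assumptions and not confirmed by observation: the size in bytes 3,4 is treated as big-endian, and the checksum is taken modulo 256. The function names are made up for illustration.

```python
SEPARATOR = bytes([0xAA, 0x55])
HEADER_LEN = 6
CHUNK_SIZE = {1: 8, 2: 8, 3: 8, 4: 32, 5: 8}  # ID 6 carries no data


def first_subrecord_offset(data):
    """Offset of the first header, i.e. just past the first separator."""
    i = bytes(data).find(SEPARATOR)
    return -1 if i < 0 else i + len(SEPARATOR)


def parse_header(hdr):
    """Parse a 6-byte sub-record header.

    Returns (sub_id, n_chunks, total_size) or raises ValueError.
    """
    if len(hdr) < HEADER_LEN:
        raise ValueError("short header")
    sub_id = hdr[0]
    if not 1 <= sub_id <= 6:
        raise ValueError("invalid sub-record ID %d" % sub_id)
    # Checksum: total of bytes 0..4 minus one (modulo 256 is assumed).
    if hdr[5] != ((sum(hdr[0:5]) - 1) & 0xFF):
        raise ValueError("header checksum mismatch")
    n_chunks = (hdr[3] << 8) | hdr[4]  # byte order is assumed
    if sub_id == 6:
        total = HEADER_LEN  # the declared size of 4 is meaningless
    else:
        total = HEADER_LEN + CHUNK_SIZE[sub_id] * n_chunks
    return sub_id, n_chunks, total
```

For example, a header of 04 00 00 00 02 05 would describe an ID 4 sub-record with two 32-byte chunks, 70 bytes in total including the header.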

To the best of my knowledge this is what each sub-record contains:

ID      Contents
======================================
1,2   These seem to contain 8-byte chunks of historical min/max data.
        Each 8-byte chunk appears to contain two data bytes plus a 5-byte time stamp
        indicating when the min or max event occurred
3       This time stamp indicates when the most recent history record was stored,
        according to the console's clock.
4       History data.
5       This time stamp indicates when the request for history data was received.
6       This only seems to be an end marker indicating that no more data follows in the input report.

Time stamp sub-records (IDs 3,5) contain 8 data bytes as follows.

Byte   Contents
=============================================
0,1    for sub-ID 3, the number of history records available when the request was received.
0,1    for sub-ID 5, unknown
2      year - 2000
3      month
4      day
5      hour
6      minute
7      for sub-ID 3, a checksum - sum of bytes 0..6 (do not subtract one)
7      for sub-ID 5, unknown; it is just 0xff
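
A minimal Python sketch of decoding these 8 data bytes follows. The function name is hypothetical; treating bytes 0,1 as a big-endian count and taking the checksum modulo 256 are both assumptions.

```python
from datetime import datetime


def decode_timestamp_chunk(chunk, sub_id):
    """Decode the 8 data bytes of a time stamp sub-record (ID 3 or 5).

    Returns (count, when). count is the number of history records
    available for ID 3, and None for ID 5 (bytes 0,1 are unknown there).
    """
    if len(chunk) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(chunk))
    if sub_id == 3:
        # Sum of bytes 0..6, no "minus one" here (modulo 256 is assumed).
        if chunk[7] != (sum(chunk[0:7]) & 0xFF):
            raise ValueError("time stamp checksum mismatch")
        count = (chunk[0] << 8) | chunk[1]  # byte order is assumed
    elif sub_id == 5:
        count = None  # byte 7 is just 0xff, nothing to verify
    else:
        raise ValueError("not a time stamp sub-record")
    # datetime() itself raises ValueError for an impossible date or time,
    # which doubles as a sanity check on corrupt data.
    when = datetime(2000 + chunk[2], chunk[3], chunk[4], chunk[5], chunk[6])
    return count, when
```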



12-minute History Records

This applies to the sub-record in Input Report 3 with an ID code of "4". The first six bytes are the sub-record header, and bytes 3,4 contain the number of history records to follow within the sub-record; call this "N". After stripping off the 6-byte record header you should be left with N x 32 bytes. If not, then data got corrupted or you had an incomplete transfer.

After removing the header (6 bytes) break the remainder into 32-byte chunks. The description below applies to each 32-byte chunk. I believe that records are sent in most-recent-first order, so the time stamp on sub-record ID 3 applies to the first 32-byte chunk and each following record is another 12 minutes into the past.

Byte(s)   Mask   Contents
========================================================
 0-1             Indoor temp F = (value-1480)/10
 2-3             Outdoor temp F = (value-1480)/10
 4               so far, always zero.
 5               Indoor RH %
 6               so far, always zero.
 7               Outdoor RH %
 8-9             Wind Chill F = (value-1480)/10
 10-11           Heat Index F = (value-1480)/10
 12-13           Dew Point F = (value-1480)/10
 14-15   0x07ff  Pressure in mb
 16              see below
 17      0xf0    always zero so far on all consoles
 17      0x0f    Current wind direction (decode per the known wind direction map)
 18-19   0xffff  Current wind speed km/hr = value / 16
 20-21   0xffff  Peak wind speed in km/hr = value / 16
 22-23   0xffff  Average wind speed in km/hr = value / 16
 24-25           Amount of rain in rain event
 26-30           Date/time of rain event, yy,mm,dd,hh,mm (all 0xff if there is no rain event)
 31              see below

Notes:

Bytes 4,6 have only been seen to be zero on any console

Byte 16 is always zero on the 2032 console, but seems to  be a copy of byte 21 on the 1035 console.
        perhaps this is a check byte?

Byte 31 is always zero on the 2032 console, but seems to be a copy of byte 30 on the 1035 console.
        could this be another check byte?


Wind speed data from 1035 console contains fractional amounts -- the least significant nibble is
not always zero. The 2032 console only shows integer values for wind speeds.



For only getting at indoor RH, query history every 12 minutes and pluck out byte[5], discarding everything else. The first query may return many history entries (emptying the logger memory) but every 12-minute query after that will only return one history entry.
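
To make the table concrete, here is a Python sketch that decodes one 32-byte history chunk. It is a sketch under assumptions, not tested against a console: multi-byte fields are treated as big-endian, the names are made up, the wind direction is left as the raw 4-bit index (to be looked up in the known direction map), and the rain amount is left raw because its units are not given above.

```python
def be16(buf, i):
    """16-bit value from buf[i], buf[i+1]; big-endian is an assumption."""
    return (buf[i] << 8) | buf[i + 1]


def temp_f(value):
    return (value - 1480) / 10.0


def decode_history_record(rec):
    """Decode one 32-byte history chunk into a dict."""
    if len(rec) != 32:
        raise ValueError("expected 32 bytes, got %d" % len(rec))
    rain_time = None
    if any(b != 0xFF for b in rec[26:31]):
        rain_time = tuple(rec[26:31])  # raw (yy, mm, dd, hh, mm)
    return {
        "in_temp_f": temp_f(be16(rec, 0)),
        "out_temp_f": temp_f(be16(rec, 2)),
        "in_rh": rec[5],
        "out_rh": rec[7],
        "wind_chill_f": temp_f(be16(rec, 8)),
        "heat_index_f": temp_f(be16(rec, 10)),
        "dew_point_f": temp_f(be16(rec, 12)),
        "pressure_mb": be16(rec, 14) & 0x07FF,
        "wind_dir_index": rec[17] & 0x0F,
        "wind_kmh": be16(rec, 18) / 16.0,
        "wind_peak_kmh": be16(rec, 20) / 16.0,
        "wind_avg_kmh": be16(rec, 22) / 16.0,
        "rain_event_raw": be16(rec, 24),
        "rain_event_time": rain_time,
    }


def split_history(payload):
    """Split an ID 4 payload (header already removed) into 32-byte chunks."""
    if len(payload) % 32:
        raise ValueError("payload is not a multiple of 32 bytes")
    return [payload[i:i + 32] for i in range(0, len(payload), 32)]
```

If only indoor RH is wanted, `decode_history_record(split_history(payload)[0])["in_rh"]` picks byte 5 of the first chunk, which should be the most recent record if the most-recent-first ordering holds.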


Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #9 on: September 06, 2015, 03:51:08 AM »
whoops -- after reading your post more carefully, I'll say that I have not yet seen the 1035 console spit out corrupt data.

Do you know how often this happens or any more about exactly what happens? Does it need a lot of history data or do you just need to run it a long time? Is there more discussion of this on one of the google threads?

Offline mwall

  • Contributor
  • ***
  • Posts: 135
Re: Some more info for your AcuRite driver
« Reply #10 on: September 06, 2015, 10:42:41 AM »
Quote from: aweatherguy
whoops -- after reading your post more carefully, I'll say that I have not yet seen the 1035 console spit out corrupt data.

Do you know how often this happens or any more about exactly what happens? Does it need a lot of history data or do you just need to run it a long time? Is there more discussion of this on one of the google threads?

there are two issues:

1) periodic lockup of the USB communications.  this happens only when the console is in usb mode 3, and only sometimes when i try to get R3 data.  sometimes a software usb reset will fix it, sometimes unplug/replug the usb connector, sometimes power cycling the console.

2) corrupt data.  i see corrupt R1 and R2 data from the 01036 station periodically.  typically there will be 2 or 3 corrupt messages per day, but that number goes up if the connection between console and instrument cluster is bad.  it also goes up if the console is in usb mode 3.

no other discussion about this - just observations i have made while poking at the station.  it has happened with the data logger full and with the data logger empty.

reading R1 and R2 is easy - just do a controlMsg with appropriate report number and number of bytes, then you get the data back from the station.

the R3 is not so easy since it is more than a single controlMsg read. 

when i talk about R3 data i mean this: a single R3 read consists of 33 bytes.  a full R3 message consists of multiple R3 reads.

to get R3 data, i first send controlMsg with HID_OUTPUT_REPORT, then follow that with 18 R3 reads (the R1, R2, and R3 reads use HID_INPUT_REPORT).  i can do 18 consecutive R3 reads pretty consistently.  the lockups happen when i try to do more than 18 consecutive R3 reads, but not always.  exactly how much memory is in the station?  is a block of 18 R3 reads the entire station history (18*33 = 594 bytes)?
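
the read sequence described above could be sketched like this. to keep it free of any particular usb library, the two usb operations are passed in as callables; both callables and the function name are hypothetical stand-ins, not the driver's actual api.

```python
R3_READ_LEN = 33   # bytes in a single R3 read
R3_READS = 18      # consecutive reads that work fairly consistently


def read_r3_block(send_output_report, read_r3, n_reads=R3_READS):
    """Collect one block of R3 reads.

    send_output_report: hypothetical callable that sends the
                        HID_OUTPUT_REPORT control message
    read_r3:            hypothetical callable that returns the bytes
                        of one R3 read
    Returns the concatenated bytes, or None if any read was short.
    """
    send_output_report()
    block = bytearray()
    for _ in range(n_reads):
        chunk = read_r3()
        if len(chunk) != R3_READ_LEN:
            return None  # short read: stop asking and try again later
        block.extend(chunk)
    return bytes(block)
```

with the defaults this returns 18 * 33 = 594 bytes, or None as soon as one read comes back short.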

also, the HID_OUTPUT_REPORT returns two bytes - do you know what those mean?

my intent has been to get a full set of R3 reads to work properly, repeatably, without lockup.  but i'm not there yet.

m

Offline mwall

  • Contributor
  • ***
  • Posts: 135
Re: Some more info for your AcuRite driver
« Reply #11 on: September 06, 2015, 12:23:37 PM »
Quote from: aweatherguy
I agree that it's good to know if you have the 2032C console or not. That said I suggest you modify the pressure formula as mentioned (value / 16 - 208mb) and it will match exactly what the new acurite software is decoding.

One thing to be aware of going forward is that some day acurite may release a new console which reports the same constants in R2 but does not have the firmware problems of the 2032C console. Today however, this test works.

there are now two configuration options in the driver:

use_constants = (True | False)
ignore_bounds = (True | False)

the default is now False for each.  so by default the driver will not use the calibration constants - it will use the linear function.  if someone wants to use the calibration constants, set use_constants=True.  if someone wants to use calibration constants and they have one of the 01035/01036 consoles with flaky constants, they can set use_constants=True and ignore_bounds=True.

btw, hats off to andrew daviel for reverse engineering that one.  he came up with this function:

p = 0.062585727 * d1 - 209.6211
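
for comparison, here is a small python sketch that puts this function next to the value / 16 - 208 formula. it assumes d1 is the same raw value in both, and the function names are made up.

```python
def pressure_linear(d1):
    """value / 16 - 208, the formula suggested earlier in the thread."""
    return d1 / 16.0 - 208.0


def pressure_daviel(d1):
    """andrew daviel's fitted function."""
    return 0.062585727 * d1 - 209.6211


def difference_mb(d1):
    """How far the fitted function sits above the linear formula, in mb."""
    return pressure_daviel(d1) - pressure_linear(d1)
```

since 1/16 = 0.0625, the slopes are nearly identical. at a raw value of 19488 (1010 mb with the first formula) the two differ by about 0.05 mb.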

m

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #12 on: September 06, 2015, 05:10:35 PM »
Matt,

More great information -- thanks and keep it coming!

To start with I am hoping that I have an algorithm here that prevents the USB comms lockup. I can't say that for sure just yet because there's not enough test time, but I have not seen any lockups yet with either console. I just need to run it with both consoles long enough to be sure it is working.

The algorithm also identifies corrupt data and ignores it and yes I do see corrupt R1 and R2 data occasionally with the 2032C console. Not yet with the 1035 but it should still be able to identify bad data. BTW, most of what I see with 2032 console is truncated records -- they're not as long as they should be. The algorithm also performs some sanity checks on data values.

The wild rain data issue is not known to me but I suspect it is either caused by a truncated input report #1 (where the rain data gets interpreted after merging with previous data stored in the buffer) or by actual bad data. Either way I think it is possible to detect this and filter it out.

I don't know about the I/O routines used in weewx, but in Windows you don't get any return value telling you how many bytes were actually transferred by the input report. What I've had to do is fill the buffer with a fixed value (0xEE seems to work) that does not normally appear in any input report. Then by examining the buffer (starting at the end and working backwards), I can figure out how many bytes were actually transferred. In Windows at least, this is crucial to being able to spot corrupt data.
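
The fill-pattern trick looks roughly like this when written in Python (my own code is .NET, so this is just an illustration and the function names are made up). The obvious limitation is that a report whose real data ends in 0xEE would be measured as shorter than it is.

```python
FILL = 0xEE  # a value that does not normally appear in input reports


def new_report_buffer(size, fill=FILL):
    """Buffer pre-filled with the pattern, to be handed to the read call."""
    return bytearray([fill] * size)


def received_length(buf, fill=FILL):
    """Number of bytes actually written, found by scanning from the end."""
    n = len(buf)
    while n > 0 and buf[n - 1] == fill:
        n -= 1
    return n
```

If `received_length()` comes back smaller than the declared report length, the report was truncated and should be thrown away.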



The output report "response" you ask about is actually data being sent from the PC to the console -- Wireshark seems to mis-identify this as data going the other way. The first byte there (0x01) is just the Output Report number, "1". The second byte is data for the report and I don't know what it means -- but all AcuRite software to this point sends 0xff for the data byte.

My belief (not yet disproven by observation) is that sending this output report tells the console to reset its internal buffer pointer as to where it will start sending history from. In other words it's a request to re-start the history report from the beginning. From that point every time you ask for Input Report 3, the console sends you (ideally) the next 32 bytes of data in the history data structure.



What I've discovered is that if you see some bad data you can do this:

Stop asking for more IR3s, wait a while and then send the output report again. Now if you start asking for IR3s you will get history from the beginning again. However, if the console manages to send the sub-record ID 5 (e.g. you see 0xaa, 0x55, 0x05 in the data stream) then the console will reset history pointers and the next request will only get new history entries. This is important because if there is a full logger, then getting the sub-record ID 5 tells you that the console has "dumped" history and you won't get huge IR3s anymore.



The R3 reading algorithm is a little involved so I'll just try to give you the overview at this point.

First, I go easy on trying to get R3. When R3 is due (every 12 minutes) I don't ask for it unless there have been some minimum number of valid and consecutive R1 and R2 queries (e.g. 3). If I get a corrupt R1 or R2 this count gets reset and the R3 query will get delayed.

Next, when reading R3, watch the data as it comes in. I'll skip over the details of exactly what to reject, but what the algorithm is looking for is the first history entry to be valid and the appearance of the sub-record ID5 and/or ID6. If the first history entry is okay then I have what I want (don't care about older entries) and the rest of the history being trashed is okay. Once I see the ID5 or ID6 sub-records then I know that history is "dumped". Today I managed to dump 62 history records from a 2032 console on the first try.

If something goes wrong in the R3 data stream, then stop asking for more input report 3's. Don't try to send an output report right away to restart the process. Wait for at least 3 consecutive good R1 and/or R2 queries before sending another output report to restart the process.

Finally, when this is happening, the first contents of Input Report 3 may contain some garbage -- looks like it might be left over from the previous attempt or something. I ignore anything up to the first 0xaa,0x55 separator before paying attention to the data. Also, sometimes I'll see a very short sub-record ID1 immediately followed by another ID1 sub-record. This is tolerated and I just keep reading data.
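
The "go easy on R3" part of this can be sketched as a small gate object. This is only an illustration of the counting logic in Python, with made-up class and method names; it leaves out the actual timing and the checks on the R3 data itself.

```python
class R3Gate:
    """Decide when it is reasonable to attempt an R3 read.

    min_good: consecutive valid R1/R2 reads required first (e.g. 3).
    """

    def __init__(self, min_good=3):
        self.min_good = min_good
        self.good = 0
        self.r3_due = False

    def note_r1_r2(self, valid):
        """Record one R1 or R2 read; a corrupt one resets the count."""
        self.good = self.good + 1 if valid else 0

    def note_r3_due(self):
        """Call when the 12-minute history interval has elapsed."""
        self.r3_due = True

    def note_r3_result(self, ok):
        """Record the outcome of an R3 attempt."""
        if ok:
            self.r3_due = False
        else:
            # Do not restart right away: require a fresh run of good
            # R1/R2 reads before sending another output report.
            self.good = 0

    def may_read_r3(self):
        return self.r3_due and self.good >= self.min_good
```

The driver loop would call `note_r1_r2()` after every R1 or R2 read and only send the output report when `may_read_r3()` returns True.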



In summary I've left out some detail to keep the message from being too complex but hopefully you get the gist of where this is heading...



Offline jonkjon

  • Senior Member
  • **
  • Posts: 66
    • K2mm Weather
Re: Some more info for your AcuRite driver
« Reply #13 on: September 07, 2015, 11:28:29 AM »
When you say that your 1035 console never spits out any bad data on R3, would that also apply to rain data? I am getting frequent (~ every other day) phantom rain / rain_rate data from my 1035 console using the latest build of weewx. The console itself never shows these phantom events however. I am currently using acurite-0.19.py.

--Jon

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #14 on: September 07, 2015, 06:47:49 PM »
Well, I'm going to have to take that back now -- about the 1035 console being "perfect".  :oops:

I ran overnight with a computer streaming from the 1035 console (all 3 input reports) and did see one place where invalid data appeared. It was in a history record but unfortunately I made a mistake in the test program and it did not actually dump the offending data. Rats.

Anyway, I'm now on board with the idea that the 1035 console does emit bad data, but just a lot less frequently than the 2032 console. I fixed the test program; it's running now and it should log whatever bad data comes along so after perhaps a few days or a week I'll know more about the how and where of this bad data.

I am still fairly certain that it will be possible to spot the bad data and ignore it so that you won't get the wild rain values. However that's not 100% certain and it will just have to wait until I have more test data. Are you running in PC Connect mode 4 now?

I have not had any lockups yet that required manual intervention but that also is probably not meaningful until I've had at least a week of solid streaming from the console.



One last thing: After some analysis I know why and where the new AcuRite app locks up when reading the 02032 console. I even know what they need to do to make it a lot more robust. However, I'm not going to say any more here because AcuRite has been soooooo steadfast in not commenting on inquiries about firmware bugs (other than to say that they don't know of any). If there's no firmware bug, then they certainly don't need any outside help in making their new app more tolerant of non-existent bugs  ;)


« Last Edit: September 07, 2015, 06:51:31 PM by aweatherguy »

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #15 on: September 07, 2015, 07:52:13 PM »
While I'm running my tests (in Windows/.NET), here's something that someone could try with the weewx driver (on Linux, I'm assuming).

To detect some of this bad data, you need to figure out how many bytes are actually received when you call the "handle.controlMsg()" function to read an input report in your "read" function. This "controlMsg()" function seems to allocate the buffer so you don't have the opportunity to fill it with a pattern like 0xEE prior to the transfer. This is of course a pain because HID devices are supposed to only return input reports whose length matches the value declared in the descriptor.

See if you can figure out how to supply a buffer to controlMsg(), as I suspect that the "controlMsg()" function will probably return only 6 bytes in a 10-byte buffer if the AcuRite console sends a truncated input report (that's what happens with the Windows API function to get an input report).

Offline mwall

  • Contributor
  • ***
  • Posts: 135
Re: Some more info for your AcuRite driver
« Reply #16 on: September 07, 2015, 08:31:46 PM »
controlMsg does not always return the number of bytes one would expect.  for example, an R1 read should return 10 bytes.  if we get a message that starts with 01 but it has length other than 10 bytes, we detect that and ignore the message.
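
as a sketch, the length check amounts to something like this (the function name is made up, and this is not the exact driver code):

```python
R1_LEN = 10  # an R1 read should return 10 bytes


def is_plausible_r1(msg):
    """True if msg is exactly 10 bytes and starts with report number 01."""
    return len(msg) == R1_LEN and msg[0] == 0x01
```

anything that fails the check is ignored. this only catches short or mislabeled reads, not packets that have the right length and bad contents.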

the phantom rain is coming from spurious jumps of the rain counter.  the packets appear to be fine - they are the correct length, and all values are within sensor limits.  but the rain counter makes big jumps up and down over the course of a day or hour or minute (no pattern yet to the spurious jumps).

btw, phantom rain makes three issues outstanding for the weewx driver:

1) being able to read R3 reports without borking the usb comms, in a way that works on both 01035/01036 and 02032 hardware
2) being able to detect bogus R1, R2, or R3 packets (since acurite was too stupid to include checksums)
3) being able to detect spurious jumps in the rain counter

m

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #17 on: September 08, 2015, 06:34:45 PM »
Well, you are better off with the controlMsg function then...in Windows I have to pre-pack the buffer to figure out if the length is short. Either way works okay though.

I'm running a 1035 console continuously now in PC Connect Mode 3; it's been going for the better part of a day. No short packets yet on R1 or R2 and history still reads okay. I'll peruse the logged rain data to look for jumps and let you know if I find any.

Per your (1), I'm still hopeful that the algorithm I've got right now is going to be robust in terms of not locking up. It will just take time to find out. Will keep you posted.

on (2), detecting bogus R3 is fairly easy I think because there is so much of it, but I don't know about (3). Hopefully I can capture some of that and figure something out.

Again at this point it's just a matter of logging more hours to see what happens. Will let you know when something pops.

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #18 on: September 09, 2015, 02:37:57 PM »
Regarding corrupt data in input report 1 (R1):

Let me make sure I understand correctly that this has nothing to do with PC Connect Mode 3 versus 4 -- it occurs in both modes...is that correct?

Don't forget that the 5n1 sensor does not use a rolling code, so the console might pick up another nearby sensor if it is on the same channel. I don't know what sort of filtering the console tries to do to avoid this but if you are seeing occasional bad rain data I would try changing the channel on the sensor and console to see if that helps.



I did see one corrupt R1 record from the 1035 console last night. It is so far from anything normal that it would easily be filtered out. Here it is:

R1: 95 39 E3 01 1D 01 2B 07 00

The length is correct but just about everything else is wrong.



I have figured out a little more of the fine detail in R1 records which is contained in the attached PDF file. This is a couple pages from a document I'm working on about the USB protocol. I'll make the entire doc available when it gets a bit further along.

There is a proposed set of checks you can perform on R1 in there. I don't think the weewx driver is doing all of these checks right now (but I could be wrong on that). If not it might help to add some more checks on R1.



As an aside, assume the console attempts to filter other nearby sensors by only accepting transmissions within a narrow window every 18 seconds (once it finds the sensor). Assume a nearby interfering sensor is transmitting 1 second after the local (desired) sensor and the console is filtering that out because it is outside the acceptable time window. If the interfering sensor's clock is about 200ppm (parts-per-million) slower than the desired sensor's, then in one day it will have drifted by about 17 seconds, to where it is transmitting right on top of the desired sensor. I don't know what tolerances AcuRite is using on their sensor clock crystals, but a +/-100ppm or +/-200ppm spec is not unusual.
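
For anyone who wants to check the drift arithmetic, here is a small sketch. The function names are made up for illustration; the only inputs are the 200ppm offset and the one-day interval from the paragraph above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def drift_seconds(ppm_offset, elapsed_seconds=SECONDS_PER_DAY):
    """Timing drift accumulated between two clocks whose rates differ by ppm_offset."""
    return ppm_offset * 1e-6 * elapsed_seconds


def days_to_drift(gap_seconds, ppm_offset):
    """Days needed for the two transmit times to slide by gap_seconds."""
    return gap_seconds / drift_seconds(ppm_offset)


# 200ppm over one day is a bit more than 17 seconds.
print(round(drift_seconds(200), 2))      # 17.28
# Sliding by 17 seconds at 200ppm takes just under one day.
print(round(days_to_drift(17, 200), 2))  # 0.98
```

At +/-100ppm the same slide would take roughly twice as long, so an interfering sensor could line up with the desired one every few days instead of daily.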
« Last Edit: September 09, 2015, 02:39:56 PM by aweatherguy »

Offline mwall

  • Contributor
  • ***
  • Posts: 135
Re: Some more info for your AcuRite driver
« Reply #19 on: September 09, 2015, 03:05:38 PM »
Regarding corrupt data in input report 1 (R1):

Let me make sure I understand correctly that this has nothing to do with PC Connect Mode 3 versus 4 -- it occurs in both modes...is that correct?

short reads happen whether in mode 3 or mode 4, but short reads are easy to detect and ignore.

corrupt data happens for both R1 and R2.  in the case of R1, it is easy to catch if the values are outside the official sensor ranges.  the nasty ones are when there is clearly corrupt data, but all values are within sensor capabilities.  i'm pretty much giving up on those - we have other mechanisms in weewx for dealing with that (e.g., spike detection/rejection).

i have seen corrupt R1 in both mode 3 and mode 4.  sometimes while in mode 3 the station gets wedged - it sends the same (apparently valid) R1 until you do a usb reset.  this is on a 01036.  does not always happen - just when i've been abusing it by asking it for R3 data.

corrupt R2 are easy to catch as well.  for 01035/01036 stations, the constants have well-defined limits, and for 02032 stations the constants-placeholders have known values.  i see corrupt R2 on a 01036 station maybe once every day to once every week.  the weewx driver notes the apparently corrupt R2, ignores its data, then notes again when a 'sane' R2 is sent again.  i see it much more often in mode 3 than in mode 4.

the spurious rain is a special case.  the behavior is that the rain counter changes to a lower value (so it looks like a counter wraparound, but the magnitude is wrong), stays there for awhile, then at some point returns to its previous value.  the return to previous value looks like a rain event.

btw, this spurious rain counter behavior happens in the fine offset stations as well.  we solved the problem there by watching the memory near the rain counter.  it turns out that the spurious counter changes were coupled with the timing of reading memory - if you do a read while the station is updating memory, you get what looks like a rain event.  but not always.
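
One possible way to keep the drop-and-return pattern described above from showing up as rain is sketched below. This is only an illustration of the idea, not what the weewx driver does; the class name, the `max_suspect_readings` threshold and its default are all assumptions.

```python
class RainCounterFilter:
    """Ignore a rain counter that drops and later returns to its old value."""

    def __init__(self, max_suspect_readings=50):
        self.trusted = None          # last counter value we believe
        self.suspect_count = 0       # consecutive readings below the trusted value
        self.max_suspect_readings = max_suspect_readings  # assumed threshold

    def update(self, counter):
        """Return the rain increment (in counter ticks) for this reading."""
        if self.trusted is None:
            self.trusted = counter
            return 0
        if counter < self.trusted:
            # Looks like the spurious drop: report no rain, keep the old value.
            self.suspect_count += 1
            if self.suspect_count > self.max_suspect_readings:
                # The low value has persisted; accept it as a genuine reset.
                self.trusted = counter
                self.suspect_count = 0
            return 0
        delta = counter - self.trusted
        self.trusted = counter
        self.suspect_count = 0
        return delta


f = RainCounterFilter()
# Counter drops from 100 to 37, sits there, returns to 100, then one real tick.
print([f.update(c) for c in [100, 100, 37, 37, 100, 101]])  # [0, 0, 0, 0, 0, 1]
```

The trade-off is that a real counter reset is only accepted after the low value has persisted for a while, so rain falling during that window would be missed.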

I have figured out a little more of the fine detail in R1 records which is contained in the attached PDF file. This is a couple pages from a document I'm working on about the USB protocol. I'll make the entire doc available when it gets a bit further along.

There is a proposed set of checks you can perform on R1 in there. I don't think the weewx driver is doing all of these checks right now (but I could be wrong on that). If not it might help to add some more checks on R1.

thanks for that.  at some point it would help to see your code.  it's one thing to describe an algorithm, but, as you know, actual code tells all.

everything that i officially know, as well as the most recent implementation, is maintained in the weewx acurite driver here:

https://github.com/weewx/weewx/blob/master/bin/weewx/drivers/acurite.py

many thanks for your hard work on the decoding.

m
« Last Edit: September 09, 2015, 03:08:06 PM by mwall »

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #20 on: September 09, 2015, 04:18:28 PM »
I already spotted a couple of mistakes in the R1 tables there -- wrong byte masks for temperature and I left out RH. I'll fix those locally here but you already know those fields anyway.

Have you tried changing transmit channels (the ABC switch) with the spurious rain problem? It would be good to rule that possibility out if you have not already done so. If you were getting interference from another 5n1 sensor I think that's just what it might look like. Obviously, if the problem is not an interfering sensor then the channel switch won't fix anything.

I think my algorithm is fairly stable now -- running with 01035 for a couple days now -- for testing purposes at least. I'll work on making the Windows .NET test app available so you can see the code. It would be good if you could run it on your console for a while looking for bad data, lockups etc. (Sorry, but the app won't run on Linux because it makes P/Invoke calls into the Windows API)

Glad to help out with this and it also helps a lot to have a two-way discussion and hear others' experiences.


Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #21 on: September 09, 2015, 11:49:57 PM »
Just saw some corrupt R3 data on the 1035 console. The console was being queried every 12 minutes for history and at one point spewed a time stamp of:

00 01 0A 0C 06 00 00 1D = Dec 6, 2010 00:00 (midnight) -- the clock is not set correctly on the console

along with a good history entry. Then 12 mins later on the next history report query came this time stamp:

00 01 0A 0C 06 00 0C 29 = Dec 6, 2010 00:12 (12 mins after midnight)

along with a completely trashed history entry. As the invalid history entry was recognized, the algorithm stopped reading input report 3 and instead (after a delay and some successful queries of other input reports) sent another output report to reset the history query.

The next attempt to get history was failed by the algorithm because the first part of the input report had some left-over junk from the last attempt. So, another delay and another output report.

Third time however is a charm; it works and retrieves a valid history entry for the time stamp of 12 minutes after midnight.

In summary, the algorithm spotted some bad data mid-report and was able to successfully restart the history querying operation. It recovered a valid history entry after sorting through some junk and nothing locked up in the process. Very encouraging...
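
The recovery sequence above (bad entry, delay, reset, retry) can be sketched roughly as follows. This is not the actual test app code; `read_entry`, `reset_query` and `is_valid` are hypothetical callables standing in for the real USB reads, the output report that restarts the history query, and the validity checks.

```python
import time


def fetch_history(read_entry, reset_query, is_valid, max_attempts=3, delay=0.0):
    """Try to read one valid history entry, resetting the query after each failure.

    Returns (entry, attempts_used), or (None, max_attempts) if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        entry = read_entry()
        if is_valid(entry):
            return entry, attempt
        # Bad data: stop reading, wait, then restart the history query.
        time.sleep(delay)
        reset_query()
    return None, max_attempts


# Simulated console: trashed entry, left-over junk, then a good entry.
replies = iter([b"trashed", b"leftover", b"good"])
resets = []
result = fetch_history(
    read_entry=lambda: next(replies),
    reset_query=lambda: resets.append("reset"),
    is_valid=lambda e: e == b"good",
)
print(result, len(resets))  # (b'good', 3) 2
```

In a real driver the delay would be non-zero, and other input reports would be polled in between, as described above.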

P.S. Don't know if this is a function of being the first history entry after midnight; time will tell on that one.
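
For reference, here is a sketch of decoding the two time stamps above. The byte layout is inferred from just these two samples, so treat it as an assumption: byte 2 = year minus 2000, byte 3 = month, byte 4 = day, byte 5 = hour, byte 6 = minute. In both samples the last byte also happens to equal the sum of the first seven bytes, which suggests a checksum, but that is an observation and not a confirmed part of the protocol. The meaning of the first two bytes (00 01) is unknown. The function name is made up, and the code is written for Python 3.

```python
from datetime import datetime


def decode_history_timestamp(raw):
    """Decode an 8-byte history time stamp (layout inferred from two samples)."""
    if len(raw) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(raw))
    # datetime() raises ValueError for impossible fields (e.g. month 0),
    # which doubles as a cheap corruption check.
    stamp = datetime(2000 + raw[2], raw[3], raw[4], raw[5], raw[6])
    # Assumed checksum: low byte of the sum of the first seven bytes.
    checksum_ok = (sum(raw[:7]) & 0xFF) == raw[7]
    return stamp, checksum_ok


for text in ("00 01 0A 0C 06 00 00 1D", "00 01 0A 0C 06 00 0C 29"):
    stamp, ok = decode_history_timestamp(bytes.fromhex(text))
    print(stamp, ok)
```

Both samples decode to Dec 6, 2010 (00:00 and 00:12) with the assumed checksum matching.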

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #22 on: September 10, 2015, 06:56:24 PM »
mwall -- I just sent you a PM with a link to the source code for my acurite console test app.

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #23 on: September 12, 2015, 01:42:26 AM »
Minor update -- 36 hours now on the 01035 console. I had 13 truncated R2 reports (23 instead of the expected 24 bytes received), zero errors on R1 and no unusable history R3 reports (using the corruption-tolerant algorithm). No lockups. There may or may not have been any truncated R3 records but I was not tracking that. Zero glitches in rainfall data.
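
A minimal sketch of the truncation check, using the lengths as counted in this thread (24 bytes for R2, 32 bytes for an R3 history report, and 9 bytes for R1 based on the sample record posted earlier). Depending on how your USB layer counts the report ID byte, your numbers may differ by one. The names here are made up for illustration.

```python
# Expected payload lengths as counted in this thread (assumption: no report ID byte).
EXPECTED_LENGTH = {"R1": 9, "R2": 24, "R3": 32}


def is_truncated(report_name, payload):
    """True if the payload is shorter than the expected length for that report."""
    return len(payload) < EXPECTED_LENGTH[report_name]


print(is_truncated("R2", bytes(23)))  # True
print(is_truncated("R2", bytes(24)))  # False
```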

Have you had any chance to investigate whether the rain glitches might be caused by another nearby 5n1 sensor?

I'm starting to think this data corruption tolerant algorithm is going to do the job. I'm beginning the work to implement this into the WSDL software. If you do get a chance to run the test app, I would be very interested in any results -- good or bad!

Offline aweatherguy

  • Senior Contributor
  • ****
  • Posts: 288
    • Weather Station Data Logger
Re: Some more info for your AcuRite driver
« Reply #24 on: September 14, 2015, 02:25:41 PM »
Good news. Just downloaded about 4 days of history (461 history entries) from a 02032C console using this algorithm. As expected, there were some (i.e. 21) corrupted history records but I got the most recent one okay (the one I wanted) and managed to read to the point that the console reset its internal history pointers. And, no lockups.

I've learned that corrupt USB reports average around 4% with this console. One history record (32 bytes) is also one HID input report (32 bytes). So, 21 bad records / 461 total records = about 4.6% -- just about on par with the average.
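
The numbers above are easy to check. In this sketch the function name is made up; the counts (21 bad out of 461) and the roughly 4% average are the ones quoted above. Rounded to one decimal place, 21/461 comes out at 4.6%.

```python
def corruption_rate(bad_records, total_records):
    """Fraction of records that were corrupt."""
    return bad_records / total_records


rate = corruption_rate(21, 461)
print("%.1f%%" % (rate * 100))  # 4.6%

# Number of bad records a flat 4% average would predict for 461 reports.
print(round(0.04 * 461, 1))     # 18.4
```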



I've also been running the algorithm on a 01035 console for about 4 days w/no lockups. Will check for glitches in the data and let you know what I find.