I too had this problem and updated (to 3.1.3). Since then I've had the problem once in the past 20 hours, so I believe the solution is not complete. Here is what I know based on communication from Ambient:
- The code in the firmware is single threaded. This means that while an HTTP request is outstanding to (for example) Wunderground.com and the reply has not yet been received, no other code runs. This includes code to receive data wirelessly from the indoor sensor package.
- To prevent an unresponsive server from completely stopping all code forever, a timeout is built in so that if no response is received for a certain time interval, the request is considered failed and reporting will have to wait until the next scheduled attempt. I am not 100% sure what interval the ObserverIP is using to update WU. Likewise I do not know their (fixed) timeout interval.
- Inspecting network traffic I can see that in my case the WU update frequency is about once every two seconds (version 3.1.3 firmware). Oddly enough, it sends the parameter "rtfreq=5" to WU, which I would have expected to be 2 in that case. In other words, why attempt an upload every two seconds while indicating a resolution no better than 5 seconds? I suspect mine ends up at two seconds because my requests fail (see below).
- The move, on or around March 1, of the Wunderground infrastructure from private physical servers to the Amazon Web Services infrastructure introduced different delays in the API request/response cycles, one of which is used by the ObserverIP to deliver sensor data to your account. This delay is likely a function of running on different (virtualized) hardware, and possibly of insufficient scaling, so that high (transient) load can cause longer delays than we were used to on the private infrastructure.
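For reference, the uploads I see in the capture are HTTP GETs of roughly this shape. This is a sketch only: the station ID, password, and sensor values below are hypothetical, and the exact host and parameter set may differ by firmware version; only the "rtfreq=5" parameter is taken directly from my capture:

```python
# Sketch of the kind of upload request visible in a packet capture.
# All values here are hypothetical placeholders, not my actual station data.
from urllib.parse import urlencode

BASE = "http://rtupdate.wunderground.com/weatherstation/updateweatherstation.php"

params = {
    "ID": "KXXXXXXX1",      # station ID (empty on my unit, as described below)
    "PASSWORD": "secret",   # station key
    "action": "updateraw",
    "realtime": "1",
    "rtfreq": "5",          # advertises 5 s resolution, yet sent every ~2 s
    "tempf": "71.2",        # example outdoor temperature
    "baromin": "29.92",     # pressure: the field I see turn into "--"
}

url = BASE + "?" + urlencode(params)
print(url)
```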
So now, consider the following scenario with firmware ≤ 3.1.0, which has a relatively short timeout interval. At time 0 the ObserverIP attempts to report its currently known set of sensor values (indoor+outdoor) to Wunderground. It starts a request. The new Amazon infrastructure is busy or, in general, takes a little longer than the timeout interval (which had previously been set to a value that would rarely, if ever, be exceeded). The request fails and is immediately retried (or at least is retried before the indoor sensors are serviced). After a few retries, the request finally succeeds. Meanwhile, however, no indoor sensor data was collected (because of the single-threaded issue). Because of the delays, it is already time for the next scheduled upload to WU, and a new request is made, but it now contains "--" for the indoor sensor values (I have observed this only for pressure, though).
So, Ambient's solution in the new firmware is to increase the timeout interval. While this appears to solve the issue, it really does not do so in all cases. Here is why. You would have to make the timeout large enough to handle the worst-case scenario for the Amazon infrastructure, or the above scenario repeats. Nobody can, of course, be sure in an absolute sense what this largest value should be, but assume a value is found that covers 95% of the new scenarios. That means that occasionally the problem can still occur. If they choose a larger value, it might be 99% or 99.99%, but...
And the longer the timeout, the longer the code will do nothing else whenever an actual long delay occurs. This increases the likelihood that the indoor sensors are not serviced. Once the timeout happens, a retry still prevents handling of the indoor sensors, as above. If retries do not happen immediately (I don't know whether this is the case), things can still fail, depending on how indoor sensor data retrieval is scheduled. If it is scheduled on a time basis, the delay may have caused the collection time to have passed, and we'll have to wait for the next scheduled one. Meanwhile, the WU request might be scheduled again. Simply manipulating the timeout value cannot completely solve the problem, and the single-threaded nature makes things worse.
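To make the starvation argument concrete, here is a toy single-threaded timeline. All periods and delays are hypothetical (not Ambient's actual values); it only illustrates how a blocking WU request can swallow the indoor unit's broadcast windows:

```python
# Toy model of a single-threaded firmware loop. While a WU request is
# outstanding, nothing else runs, so any indoor broadcast that arrives
# during that window is simply lost. All timing constants are invented.
UPLOAD_PERIOD = 2.0    # seconds between WU upload attempts
TIMEOUT = 10.0         # fixed request timeout (the "now larger" value)
SENSOR_PERIOD = 5.0    # indoor unit broadcasts every 5 s (assumed)

def simulate(server_delay, horizon=30.0):
    """Return (broadcasts heard, broadcasts sent) over the horizon."""
    t, heard, sent = 0.0, 0, 0
    next_upload, next_broadcast = 0.0, 1.0
    while t < horizon:
        if next_broadcast <= next_upload:
            # Idle at broadcast time: the radio frame is received.
            t = next_broadcast
            sent += 1
            heard += 1
            next_broadcast += SENSOR_PERIOD
        else:
            # Start a WU request; it blocks everything until reply/timeout.
            t = next_upload + min(server_delay, TIMEOUT)
            while next_broadcast < t:   # broadcasts during the block are lost
                sent += 1
                next_broadcast += SENSOR_PERIOD
            next_upload = t + UPLOAD_PERIOD

    return heard, sent

print(simulate(server_delay=0.2))   # fast server: every broadcast is heard
print(simulate(server_delay=9.0))   # slow server: broadcasts get lost
```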
Note, BTW, that while such delays are happening, everything slows down. This also explains why the Live Data page sometimes seems sluggish...
While I was investigating, I noticed another problem too. I actually use MeteoBridge to read values from the ObserverIP and send them to WU and other places. Consequently I have not configured my ObserverIP to send to WU, leaving the station ID and password blank. When I first read about the problems and the new firmware, I thought I would not be affected because of this; my ObserverIP had no reason to interact with WU. Boy, was I wrong. I inspected network traffic through my router and discovered that the ObserverIP nevertheless sends a sensor data upload request to WU at regular intervals. The station ID and password are passed as empty, so this request is bound to fail, and it does. Because of this, however, I am (and in fact all of us are) still subject to the delay-induced issues. It also wastes bandwidth: in my case, request plus failure response consume about 0.5KB, and doing this every two seconds means 43,200 requests a day, using about 21MB each day. I have a great connection, so it's not that big a deal, but for some remote locations this might matter. I have proposed to Ambient that if either the station ID or the password is empty, they not attempt the upload at all.
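The arithmetic behind those numbers (the 0.5KB per exchange is my rough measurement, so the result is an estimate):

```python
# Back-of-the-envelope check of the wasted-bandwidth figure.
SECONDS_PER_DAY = 24 * 60 * 60
UPLOAD_INTERVAL = 2           # seconds between (failing) WU requests
BYTES_PER_EXCHANGE = 512      # ~0.5 KB request + failure response (measured)

requests_per_day = SECONDS_PER_DAY // UPLOAD_INTERVAL
mb_per_day = requests_per_day * BYTES_PER_EXCHANGE / 1024 / 1024
print(requests_per_day, round(mb_per_day, 1))   # 43200 requests, ~21.1 MB
```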
I have written to Ambient and proposed a better solution for the timeout issue: a dynamically controlled timeout interval (similar to what the TCP protocol does), so that the timeout stays minimal until problems develop, then scales up (to a certain maximum) to see if that solves the issue, and scales back down when appropriate. They may also have to change the scheduling of the indoor unit value retrieval to guarantee it always happens before a WU request is made.
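A minimal sketch of what I have in mind, borrowing the smoothed round-trip estimator and exponential backoff from TCP's retransmission timer (RFC 6298). All constants are illustrative; nothing here is something Ambient has committed to:

```python
# Adaptive request timeout in the style of TCP's retransmission timer
# (RFC 6298). Stays short while the server is fast, backs off when
# requests time out, and re-converges once replies return.
class AdaptiveTimeout:
    ALPHA, BETA = 1 / 8, 1 / 4    # RFC 6298 smoothing gains
    MIN_T, MAX_T = 1.0, 30.0      # clamp, in seconds (assumed values)

    def __init__(self, initial=3.0):
        self.srtt = initial       # smoothed round-trip time
        self.rttvar = initial / 2 # round-trip-time variance estimate
        self.timeout = initial

    def on_success(self, rtt):
        """A reply arrived after `rtt` seconds: track the server's speed."""
        self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
        self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.timeout = min(self.MAX_T, max(self.MIN_T, self.srtt + 4 * self.rttvar))

    def on_timeout(self):
        """No reply in time: back off exponentially, up to the maximum."""
        self.timeout = min(self.MAX_T, self.timeout * 2)

t = AdaptiveTimeout()
for _ in range(3):
    t.on_timeout()        # server is slow: timeout grows 6 -> 12 -> 24
for _ in range(20):
    t.on_success(0.3)     # server recovers: timeout shrinks back down
print(round(t.timeout, 2))
```

The key property is that under normal conditions the timeout sits near the real round-trip time, so a genuinely slow exchange blocks the (single-threaded) firmware for as short a time as the current conditions allow.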
Even if the suggestion is implemented, users like myself may have a problem because the MeteoBridge (and also Ambient's version of this) gets values by requesting the Live Data page and scraping the values out of it. It is, therefore, possible to request these before the indoor data has been grabbed. If Ambient implements the optimization around empty stationID, the odds of this happening will be severely reduced because this request is the main source of delays.
So the combination of both suggestions, if implemented, will likely bring all of us back to a normal or much improved scenario.