Weather Software > Station Software Development
MySQL help, shrink database by combining multiple rows into max/min/average
(1/1)
lightmaster:
I have a database that has my weather written to it every 10 seconds. After around 10 months, I now have a little over 1.4 million records, so I'd like to try to compress that a bit to improve performance and reduce the size over long term. I'd like to shrink the data that's older than 1 month by combining the data into 5 min intervals, using things like the max temp in the 5 min window, average wind direction, etc. So in a 10 min period, instead of having 60 records with 10 seconds between them, I'd have 2 records with 5 minutes between them.
I've used queries like this to see data based on a time range, but not sure how to do a query that loops based on the timestamp and creates another table with that new data.
--- Code: ---SELECT MIN(TempInCur), MAX(TempInCur), MIN(HumInCur), max(HumInCur), min(PressCur), max(PressCur), min(TempOutCur), max(TempOutCur), min(HumOutCur), max(HumOutCur), min(DewCur), max(DewCur), min(HeatIdxCur), max(HeatIdxCur), max(WindSpeedCur), max(WindGust10), max(WindAvgSpeedCur), min(WindChillCur), max(WindChillCur), MAX(RainYear), MAX(RainRateCur), max(UVIndex), MAX(Evapotranspiration), MAX(Radiation) FROM `KGABAXLE9` WHERE `DateTime` >= '2019-01-01' ORDER BY `DateTime` DESC LIMIT 1
--- End code ---
If it helps, here's the format of my table:
--- Code: ---CREATE TABLE `KGABAXLE9` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`DateTime` datetime NOT NULL COMMENT 'Date and Time of Readings',
`TempOutCur` decimal(4,1) NOT NULL COMMENT 'Current Outdoor Temperature',
`HumOutCur` int(11) NOT NULL COMMENT 'Current Outdoor Humidity',
`PressCur` decimal(4,2) NOT NULL COMMENT 'Current Barometric Pressure',
`DewCur` decimal(4,1) NOT NULL COMMENT 'Current Dew Point',
`HeatIdxCur` decimal(4,1) NOT NULL COMMENT 'Current Heat Index',
`WindChillCur` decimal(4,1) NOT NULL COMMENT 'Current Wind Chill',
`TempInCur` decimal(4,1) NOT NULL COMMENT 'Current Indoor Temperature',
`HumInCur` int(11) NOT NULL COMMENT 'Current Indoor Humidity',
`WindSpeedCur` decimal(4,1) NOT NULL COMMENT 'Current Wind Speed',
`WindAvgSpeedCur` decimal(4,1) NOT NULL COMMENT 'Current Average Wind Speed',
`WindDirCur` int(11) NOT NULL COMMENT 'Current Wind Direction (Degrees)',
`WindDirCurEng` varchar(3) NOT NULL COMMENT 'Current Wind Direction (English)',
`WindGust10` decimal(4,1) NOT NULL COMMENT 'Max Wind Gust for Past 10 Mins',
`WindDirAvg10` int(11) NOT NULL COMMENT 'Average Wind Direction (Degrees) for Past 10 Mins',
`WindDirAvg10Eng` varchar(3) NOT NULL COMMENT 'Average Wind Direction (English) for Past 10 Mins',
`RainRateCur` decimal(5,2) NOT NULL COMMENT 'Current Rain Rate',
`RainDay` decimal(4,2) NOT NULL COMMENT 'Total Rain for Today',
`RainYest` decimal(4,2) NOT NULL COMMENT 'Total Rain for Yesterday',
`RainMonth` decimal(5,2) NOT NULL COMMENT 'Total Rain this Month',
`RainYear` decimal(5,2) NOT NULL COMMENT 'Total Rain this Year',
`Radiation` decimal(6,2) NOT NULL COMMENT 'Current amount of Solar Radiation (W/m^2)',
`Evapotranspiration` decimal(5,2) NOT NULL COMMENT 'Solar Evapotranspiration',
`UVIndex` decimal(3,1) NOT NULL COMMENT 'Current UV Index (uvi)',
`ForecastRule` int(11) NOT NULL COMMENT 'Forecast number from Davis',
PRIMARY KEY (`ID`),
KEY `TempInCur` (`TempInCur`),
KEY `TempOutCur` (`TempOutCur`),
KEY `HumOutCur` (`HumOutCur`,`PressCur`,`DewCur`,`HeatIdxCur`,`WindChillCur`,`HumInCur`,`WindSpeedCur`,`WindGust10`,`RainDay`,`RainYest`,`RainMonth`,`RainYear`),
KEY `DateTime` (`DateTime`)
) ENGINE=InnoDB AUTO_INCREMENT=1410532 DEFAULT CHARSET=utf8
--- End code ---
davidg_nz:
I don't know how well MySQL implements SQL92 date/time stuff but this is pretty easy to do in PostgreSQL. I use a query similar to below in my own software for computing 30 minute averages where the time-stamp is the time at the start of the 30 minute period (the javascript chart library I used was struggling with a month of data at 5 minute resolution).
--- Code: ---with weather_data as (
select
*,
(extract(epoch from "DateTime"::date) + extract(hour from "DateTime")::int) * 10 + extract(minute from "DateTime")::int / 5 as period
from "KGABAXLE9"
)
SELECT
min("DateTime"),
MIN("TempInCur"),
MAX("TempInCur"),
MIN("HumInCur"),
max("HumInCur"),
min("PressCur"),
max("PressCur"),
min("TempOutCur"),
max("TempOutCur"),
min("HumOutCur"),
max("HumOutCur"),
min("DewCur"),
max("DewCur"),
min("HeatIdxCur"),
max("HeatIdxCur"),
max("WindSpeedCur"),
max("WindGust10"),
max("WindAvgSpeedCur"),
min("WindChillCur"),
max("WindChillCur"),
MAX("RainYear"),
MAX("RainRateCur"),
max("UVIndex"),
MAX("Evapotranspiration"),
MAX("Radiation")
FROM weather_data
group by period
order by period desc
--- End code ---
Note that the :: syntax above ("DateTime"::date) is a Postgres-specific cast shortcut - the correct way in most other database engines would be cast("DateTime" as date).
Navigation
[0] Message Index
Go to full version