Time-series data is ubiquitous in almost every application today. One of the most frequent queries applications make on time-series data is to find the most recent value for a given device or item. In this blog post, we'll explore five methods for accessing the most recent value in PostgreSQL. Each option has its advantages and disadvantages, which we'll discuss as we go.

Note: Throughout this post, references to a "device" or "truck" are simply placeholders for whatever your application stores time-series data about, whether it be an air quality sensor, airplane, car, website visits, or something else. As you read, focus on the concept behind each option rather than the specific data we're using as an example.

Knowing how to query the most recent timestamp and data for a device in a large time-series dataset is a challenge for many application developers. We study the data, determine the appropriate schema, and create the indexes that should make the queries return quickly. When the queries aren't as fast as we expect, it's easy to be confused, because indexes in PostgreSQL are supposed to help your queries return quickly, correct?

In most cases, the answer is emphatically "yes". With the appropriate index, PostgreSQL is normally very efficient at retrieving data for your query. There are always nuances that we don't have time to get into in this post (don't create too many indexes, make sure statistics are kept up-to-date, etc.), but generally speaking, the right index will dramatically improve the query performance of a SQL database, PostgreSQL included.

Before we dive into how to efficiently find specific records in a large time-series database using indexes, I want to make sure we're talking about the same thing. For the duration of this post, all references to indexes specifically mean a B-tree index. This is the most common index type, supported by all major OLTP databases, and it is very good at locating specific rows of data across tables large and small. PostgreSQL actually supports many different index types that can help with various kinds of queries and data (including timestamp-centric data), but from here on out, we're only talking about B-tree indexes.

In our TimescaleDB Slack community channel and in other developer forums such as StackOverflow (example), developers often wonder why a query for the latest value is slow in PostgreSQL even when it seems like the correct index exists to make the query "fast".

The answer lies in how the PostgreSQL query planner works. It doesn't always use the index exactly how you might expect, as we'll discuss below. To demonstrate how PostgreSQL might use an index on a large time-series table, let's set the stage with a set of fictitious data.

For these example queries, let's pretend that our application is tracking a trucking fleet, with sensors that report data a few times a minute as long as the truck has a cell connection. Sometimes a truck loses signal, which causes data to be sent a few hours or days later. Although a real app would certainly be more involved and have a more complex schema for tracking both time-series and business-related data, let's focus on two of the tables.

The truck table tracks every truck that is part of the fleet, each identified by a truck_id. Even for a very large company, this table will typically contain only a few tens of thousands of rows. For the queries below, we'll pretend that this table has ~10,000 trucks, most of which are currently active and recording data a few times a minute.

The reading hypertable stores all data that is delivered from every truck over time. Data typically comes in a few times a minute in time order, although older data can arrive when trucks lose their connection to cell service or transmitters break down. Many IoT applications store many types of data points for each set of readings. For this example, we'll show a wide-format table schema with only a few columns of data to keep things simple.
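To make the setup concrete, here is a minimal sketch of a truck table, a wide-format reading table, and the "most recent reading for one truck" query that this post is about. It is an illustration under stated assumptions, not the schema used in the post: only `truck_id` comes from the text, while the table names, the other columns (`make`, `ts`, `mileage`, `fuel`), the index name, and the sample rows are hypothetical. It also uses Python's built-in `sqlite3` module in place of PostgreSQL so that it runs with no server, which means it says nothing about how the PostgreSQL planner would execute the same query.

```python
import sqlite3

# Hypothetical schema: only truck_id appears in the post. All other table,
# column, and index names are assumptions made for this illustration.
SCHEMA = """
CREATE TABLE truck (
    truck_id INTEGER PRIMARY KEY,
    make     TEXT
);
CREATE TABLE truck_reading (
    ts       TEXT    NOT NULL,  -- ISO-8601 UTC text, so string order matches time order
    truck_id INTEGER NOT NULL REFERENCES truck (truck_id),
    mileage  INTEGER,
    fuel     REAL
);
-- B-tree index covering the "latest reading for one truck" lookup.
CREATE INDEX ix_truck_reading_truck_ts ON truck_reading (truck_id, ts DESC);
"""


def build_demo_db():
    """Create an in-memory database with two trucks and a few readings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO truck (truck_id, make) VALUES (?, ?)",
        [(1, "Acme"), (2, "Acme")],
    )
    readings = [
        ("2024-01-01T10:00:00Z", 1, 1000, 80.0),
        ("2024-01-01T10:00:20Z", 1, 1001, 79.5),
        ("2024-01-01T10:00:00Z", 2, 5000, 40.0),
        # Late arrival: inserted last, but older than truck 2's reading above.
        ("2024-01-01T08:00:00Z", 2, 4900, 55.0),
    ]
    conn.executemany(
        "INSERT INTO truck_reading (ts, truck_id, mileage, fuel) VALUES (?, ?, ?, ?)",
        readings,
    )
    return conn


def latest_reading(conn, truck_id):
    """Return the newest reading for one truck by timestamp, or None if it has none."""
    return conn.execute(
        "SELECT ts, truck_id, mileage, fuel FROM truck_reading "
        "WHERE truck_id = ? ORDER BY ts DESC LIMIT 1",
        (truck_id,),
    ).fetchone()


if __name__ == "__main__":
    demo = build_demo_db()
    print(latest_reading(demo, 1))
    print(latest_reading(demo, 2))
```

Because the query orders by the reading's timestamp rather than by insertion order, the late-arriving row for truck 2 does not displace its newer reading.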