I am using TiDB version 6.5.1, I wanted to know the behavior of TiKV, let's say I have a table t which contains composite index example (a,b), when I am trying to execute following cases,
1.select sum(c) from t where a=123 and b='simple' group by b;
range scan happing as expected.
2.select sum(c) from t where a>=123 and b='simple' group by b;
here I am passing indexed columns in where clause, So why range scan is not happening and table full scan can causes performance issues when table size is big.
3.select sum(a) from t where a>=123 and b='simple' group by b;
if I use indexed column in select range scan happening.
4.select sum(c) from t where a>=123 group by a;
same behavior like 2.
I have a requirement like can pass total index or left prefix index with >=, <=, between, like operators to support ad-hoc query, So TiKV will support this without full table scan?
Please suggest table design if any changes required, here I am planning to use TiKV + TiSpark to cover entire HTAP usecase.
Thanks,
Ajay Babu Maguluri.
TiDB like any database has an optimizer that based on limited data (statistics) and in limited time needs to find an acceptable execution plan.
The table scan might be cheaper than other plans. You can restrict the plans the optimizer can take with hints and see what the cost is for each plan.
Here the optimizer uses a
IndexRangeScanand then in the second query we exclude the index and then it takes aTableFullScanwhich is much more expensive as you can see in theestCostcolumn.For questions like this it might be useful to share the output of
SHOW CREATE TABLE...for the tables involved. The data or some description of the data (e.g. how unique the 123 value is) would also be helpful.From the images you posted it looks like you have an index called
a_b_indexon(a, b). This means that TiDB can't use the second column of the index if you don't have a equality (=) match on the first column. Switching the order might be good as for the queries here you always do a equality match on thebcolumn and a range match on theacolumn. But I can't see the full range of queries that you do so other queries might perform worse after this.Here is some good explanation about this: https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys