Pentaho database join slower than single query in IDE

52 Views Asked by Pleasant94 At 21 September 2023 at 14:52

I'm using a PDI transformation which consists of different steps, and one of them is a database join who gets just 1 value as input, execute the query (only 1 time) then the results are manipulated and so on. That step is enormously and painfully slow (over 40 mins).

The same query via Toad by Oracle is executed in 1-2 seconds. I tried with and without each optimization hints, nothing changes, there is a lot of performance difference between PDI and the single execution via Toad.

SELECT /*+PARALLEL(4)*/
    COUNT(T.ID_VAL) AS number_of_occurences, T.name
    FROM TABLE T
    WHERE     1 = 1
    AND T.col1 = 'S'
    AND T.col2 IS NULL
    AND T.name IN
    (
        SELECT /*+INDEX OTHER_TABLE GROUP_ID_INDEX*/ DISTINCT TAB_O.name
        FROM OTHER_TABLE TAB_O
            WHERE     1 = 1
            AND TAB_O.GROUP_ID = ?--'10052023'--PASSED VALUE
            AND TAB_O.col1 = 'M'
    )
    GROUP BY T.name

where T.name is the same as TAB_O.name.

The structure of my transformation is:

The parameters of the step are:

Why there is this difference?

An example of the two tables coud be:

OTHER_TABLE:

GROUP_ID	COL1	NAME
1	S	John
1	M	Mark
2	S	Marc
2	M	Jenny
2	S	John
10052023	S	Al
10052023	M	John
10052023	M	Jack

TABLE:

ID_VAL	NAME
1	Al
2	Al
3	John
4	Jack
5	Jack
6	Jack

So I want as a result:

number_of_occurences	name
1	John
3	Jack

I think the problem lies with pentaho preparing queries, or computational overhead, I don't know how Pentaho executes queries.

Adding some infos about the environment, I am using a local instance of a Pentaho on a VM, which connects to the repository hosted on another server.

Original Q&A

Pentaho database join slower than single query in IDE

There are 0 best solutions below

Related Questions in PENTAHO

Related Questions in DATABASE-PERFORMANCE

Related Questions in PDI

Trending Questions

Popular # Hahtags

Popular Questions