Incremental Database Records Fetch in NiFi Based on Last Execution

How can I fetch records from a database table incrementally in Apache NiFi, ensuring that the next run of SQL only fetches new records added since the last execution? I’m using NiFi to execute SQL queries and want to avoid fetching data redundantly. Please provide some way for implementing this incremental fetching mechanism, including the necessary configurations and SQL modifications to achieve efficient and reliable results. Thank you!

To fetch records from a database table incrementally in Apache NiFi, you can use either the QueryDatabaseTable or QueryDatabaseTableRecord processors. Before configuring these processors, you need to set up a Database Connection Pool service and select the appropriate database type.

Here are the steps to achieve incremental fetching:

  1. Configure the processor properties:
    • Set the “Table Name” property: Provide the name of the table you want to fetch data from. If there are multiple tables, you can use either a JOIN of tables or define a view in the database that combines multiple tables and then provide the view name.
    • Set the “Columns To Return” property: Specify the column names you want to retrieve from the table. Use a comma-separated list, similar to the SELECT statement in SQL.

  1. Define the “Initial Load Strategy”:

    • Choose “Start at beginning” if you want to fetch all records from the table on the first execution. This will retrieve all the data from the specified table or view.
    • Choose “Start at current maximum value” if you want to fetch only new records added since the last execution. This option will ignore the existing records and focus on fetching new data.
  2. Configure “Maximum-Value Columns”:

    • Specify the column name that will be used for incremental fetching. The processor will use the maximum value of this column from the last execution and store it in its state. During the next execution, it will fetch records with values greater than the stored maximum value of this column.

By configuring the processor with the appropriate options and maximum-value columns, you can achieve efficient and reliable incremental fetching of records from the database table. This approach allows you to avoid redundant data retrieval and ensures that only new records are fetched in subsequent executions.

1 Like