Skip to end of metadata
Go to start of metadata

Overview

By default, Flux will store workflows (and related data like actions and variables) in the in-memory Derby database. For more control over the way Flux stores its data, you can configure Flux to use an external database as well.

For more information on configuring the engine to use an external database, see Configuring Flux to run with Databases.

For more information on the purpose of the Flux database tables, see Explanation of Flux Database Tables.

Database Deadlock

From time to time while running workflows, the engine may encounter an SQL exception indicating database deadlock. If a deadlock occurs, the workflow (or flow context) is not lost - instead, the database transaction associated with the flow context is rolled back and the flow context is re-executed.

Database deadlocks during workflow execution do not require any manual intervention — any flow context that encounters a deadlock will be automatically re-executed.

If you encounter a database deadlock while making a client call to the engine (using the Flux API), your code will need to re-execute the failed call. To avoid this, you can wrap your engine so that it automatically retries failed client calls. See the wrapEngine() method of the flux.Factory class for more information.

Minimizing Deadlocks

We also recommend disabling any of the following engine configuration options if you are not using their associated features (run history and workflow deadlines respectively):

RUN_HISTORY_ENABLED=false
FLOW_CHART_DEADLINES_ENABLED=false

The sections below also provide more information for reducing deadlocks for their specific database types.

If you see deadlocks occurring more frequently than once per hour (on average), or if you can reproduce a deadlock regularly by following a well-defined sequence of steps, email our Technical Support department at support@fluxcorp.com with an explanation of the deadlock situation. Our team will work with you to attempt to reduce the number of deadlocks to an appropriate tolerance.

Row-level Locking and Lock Escalation

Flux requires row-level locking when running with a database. The Flux engine is known to experience poor performance and deadlocks when the database does not support row-level locking. Row-level locking should be enabled at the database level when using Flux.

This settings is needed for SQL Server, for more info, refer to the SQL Server wiki.


In addition, the lock escalation threshold for the database should be set as high as possible. Due to the highly concurrent nature of Flux, a high lock escalation threshold is vital to the engine's performance.

Derby

Flux uses the built-in Derby database by default. When using the built-in Derby database, Flux automatically configures Derby to ensure the best performance for the Flux engine in terms of deadlocking, speed, and memory usage. See JVM Configuration for specifics.

In typical scenarios, Derby is expected to occupy about 300 MB of heap memory space.

Derby is appropriate for a light load of workflows and file transfers. If you are using Flux for anything but a light workload, we recommend configuring Flux to run with an external database.

Oracle

byte[] Size Restrictions in Persistent Variables

When persistent variables are stored in the database, the Java type byte[] maps to the VARBINARY database type. If this VARBINARY database type is mapped to an Oracle BLOB database column, Oracle will truncate the binary data in the byte[] field to 64 KB (this is caused by the JDBC APIs that Flux must use to communicate with the database). For this reason, if you are storing byte[] fields in your persistent variables, and you are using an Oracle database, make sure that each byte[] field holds less than 64 KB of data.

If you need to store more than 64 KB of data, one simple technique to avoid this limitation is to spread your byte[] over several byte[] fields (less than 64 KB each).

Multiple Database Users Running Separate Flux Instances

If you have multiple database users on Oracle running separate Flux instances and your user permissions are not set up properly, you may see exceptions like "ORA-00942: table or view does not exist". Often, this exception occurs when your Flux instance can see the Flux tables created by another user when it queries the database's metadata. These errors will then occur when Flux issues SQL statements at runtime, because those tables do not exist in the Flux user's user space.

If you do not have permission to use another user's tables, you will need to make that you cannot view that user's table information when querying the database metadata. You can also change the table prefix that Flux will attempt to use:

TABLE_PREFIX=MY_USER_FLUX

The default table prefix is "FLUX_". If you change the table prefix, make sure that you recreate your database tables using the correct prefix to avoid conflicts.

LOB DBMS Package

You must enable the LOB DBMS package in your Oracle database for Flux to function correctly. If this is not enabled, you may encounter errors like "Audit trail events could not be retrieved because: 'java.sql.SQLSyntaxErrorException: ORA-00904: : invalid identifier'".

JDBC Driver Classes

The JDBC driver class name for Oracle 8 is oracle.jdbc.driver.OracleDriver, and the JDBC driver class name for Oracle 9 is oracle.jdbc.OracleDriver. Be careful not to confuse the two. Using the Oracle 8 driver to connect to an Oracle 9 database might result in a "ORA-01000: maximum open cursors exceeded" error.

Workflow Size Restrictions

Due to limitations with the Oracle JDBC drivers, a Flux workflow may not contain more than 1000 actions when using an Oracle database. If you attempt to create or use a workflow with over 1000 actions, you will likely encounter the error "java.sql.SQLException: ORA-01795: maximum number of expressions in a list is 1000".

Oracle LOBs

By default, Flux uses standard JDBC APIs to persist large binary and character data to the Oracle database. Due to an Oracle limitation, standard JDBC cannot persist more than 4k bytes of data - however, using an Oracle-specific API, it is possible to overcome this Oracle limitation.

The ORACLE_LARGE_OBJECT_ADAPTER engine configuration setting allows you to specify an adapter class that uses database- and driver-specific APIs to overcome this Oracle limitation. The adapter class you use must implement the flux.OracleLargeObjectAdapter interface.

Several adapter classes are provided by default, providing support for Apache DBCP, JBoss, Tomcat, WebLogic, WebSphere, or directly using the Oracle JDBC drivers. These adapters only work when an Oracle driver, supplied by Oracle Corporation, is used direct or indirectly (wrapped by the container's driver). You can set these adapters as follows:

Apache DBCP:

ORACLE_LARGE_OBJECT_ADAPTER=flux.ApacheDbcpOracleAdapter

JBoss:

ORACLE_LARGE_OBJECT_ADAPTER=flux.JBossOracleAdapter

Tomcat:

ORACLE_LARGE_OBJECT_ADAPTER=flux.TomcatOracleAdapter

WebLogic:

ORACLE_LARGE_OBJECT_ADAPTER=flux.WebLogicOracleAdapter

WebSphere:

ORACLE_LARGE_OBJECT_ADAPTER=flux.WebSphereOracleAdapter

Oracle:

ORACLE_LARGE_OBJECT_ADAPTER=flux.OracleAdapter

You can also provide your own adapter. It must implement the flux.OracleLargeObjectAdapter interface. Your custom adapter can be used to support non-Oracle-Corporation drivers or other application servers that provide JDBC drivers that wrap Oracle JDBC drivers.

If you need to overcome this Oracle limitation in a situation where a built-in adapter is unavailable or you cannot create one, the workaround is to create a second data source in your applications server that directly uses an Oracle JDBC driver supplied by Oracle Corporation. This allows you use the built-in adapter flux.OracleAdapter.

Connections, Sessions, and Processes

A connection is a physical connection to the database. When you set the ?MAX_CONNECTIONS parameter in Flux, you are specifically limiting the number of connections to the databas.

A session, on the other hand, is the "conversation" between the client and the database. There is not necessarily a 1:1 ratio from connections to sessions. Although it is possible for a single connection to open multiple sessions to the database, Flux will typically never open more active sessions than database connections (the maximum value defined by MAX_CONNECTIONS). Because this is technically possible on the database, however, so you may need to decrease the MAX_CONNECTIONS value (and the CONCURRENCY_THROTTLE accordingly) if you notice that Flux is approaching or exceeding the maximum number of sessions allowed by your database.
 
A process is the actual process on the Operating System when the database runs a command. A single connection in Flux should never require more than one process at a time.
 
For these reasons, it should be safe to use any MAX_CONNECTIONS value as long as it is smaller than both the max sessions and processes allowed on the database itself.
 
Visit the following links for more information:

http://www.disperu.com/oracle-parameter-setting-session-and-the-processes/

http://asktom.oracle.com/pls/asktom/f?p=100:12:0::NO:\\:P12_QUESTION_ID,P12_PREV_PAGE:5671284058977,11 (http://asktom.oracle.com/pls/asktom/f?p=100:12:0::NO:\:P12_QUESTION_ID,P12_PREV_PAGE:5671284058977,11)

DB2

Increasing Performance

In DB2 7.2 or greater, you can increase performance and reduce the probability of deadlocks by setting the DB2_RR_TO_RS flag to true. This setting refers to the DB2 feature "next key locking". You can find more about next key locking in DB2 on the IBM website.

Troubleshooting

  • The Exception " flux.EngineException: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: SQLCODE: -911, SQLSTATE: 40001, SQLERRMC: 2" may be thrown as a result of the transaction log being full. To correct this issue, increase the value of the DB2 instance's "LOGFILSIZ" configuration parameter to accomodate a larger log. If storage size is an issue, adjust the "LOGPRIMARY" and "LOGSECOND" parameters to reduce the number of logs that are saved on the system.
  • The SQLException “DB2 SQL error: SQLCODE: -911, SQLSTATE: 40001, SQLERRMC: 2” may be thrown from the DB2 JDBC driver if the maximum number of database locks is reached. To correct this issue, increase the value of the DB2 instance’s LOCKLIST and MAXLOCKS configuration parameters. LOCKLIST determines the amount of storage that is allocated to the lock list, and MAXLOCKS determines what percentage of the lock list can be used by an application before lock escalation is performed.

For more information on the LOCKLIST and MAXLOCKS configuration parameters, see IBM's DB2 Tuning Tips documentation.

MySQL

MySQL Performance

MySQL provides more robust storage and performance than the Derby database Flux uses by default. MySQL is generally recommended for light to medium loads; under heavy loads, MySQL is known to encounter deadlocks that may impact performance or execution on the engine. For this reason it is recommended to use another database server if your Flux engine is expected to be placed under heavy loads.

BLOB Storage Limitations

The MySQL BLOB database column has a storage limit of 64 KB. You might want to use the MEDIUMBLOB or LONGBLOB MySQL database types to store larger binary variables.

JDBC Driver

Use the MySQL Connector/J JDBC driver, version 5 or newer. The driver class name is com.mysql.jdbc.Driver.

Transactional Tables (InnoDB)

Flux requires MySQL’s transactional tables. These tables are called "InnoDB". For Flux to work correctly with MySQL, all database tables used by Flux must be of type "InnoDB".

Transaction Isolation Level

MySQL's default transaction isolation level is REPEATABLE_READ. This can cause a large number of database deadlocks in Flux. Flux requires the READ_COMMITTED transaction isolation level, which is less strict and allows higher levels of performance. To configure MySQL to use the READ_COMMITTED transaction isolation level, add the following line to your MySQL my.ini file. 

transaction-isolation=READ-COMMITTED

Flux will report an issue in its logs if this parameter is not set and Flux will not allow any workflow processing to take place.

To change the parameter on an instance of MySQL on AWS, create a Parameter Group in your RDS Console, setting the desired parameters and then assigning the Parameter Group to your RDS Instance.

For further information on this, please see the AWS Documentation on above mentioned:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html

java.sql.SQLException: Incorrect string value: '\xEF\xBF\xBD2,7...' for column 'CONTENT' at row 1

A SQLException may be thrown when persisting Flux variable data that contains unicode character data in MySQL. To fix this, you need to modify the character set and collation to utf8mb4 on the CONTENT column in FLUX_VARIABLE table.

The default collation in MySQL is set to latin1_swedish_ci. You can find the collation using this query.

show table status like 'FLUX_VARIABLE';

To change the charset and collation on the variable table, you need to run this query.

ALTER TABLE FLUX_VARIABLE DEFAULT CHARACTER SET utf8mb4, MODIFY CONTENT LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

After running this query, you should see the collation set to utf8mb4_general_ci for the table when running the status query on FLUX_VARIABLE table. Before making this change, you need to take a backup of your MySQL Flux database.

SQL Server

Flux requires the READ COMMITTED transaction isolation level. You can enable this on your database by running the query:

ALTER DATABASE DATABASE_NAME SET READ_COMMITTED_SNAPSHOT ON;

To determine if this setting is enabled, run the following query. A return value of '1' indicates the feature is enabled.

SELECT is_read_committed_snapshot_on FROM sys.databases WHERE name= 'YourFluxDatabaseName';


Database Properties

You can configure the Flux engine to use a database properties file, which provides you with greater control over how Flux accesses and uses the database. To use a database properties file, just set the DATABASE_PROPERTIES engine configuration option like so:

DATABASE_PROPERTIES=/path/to/myfile

This file must be a standard Java properties file (and thus conform to Java properties file syntax).

Database Properties Summary

The database properties that can be configured are:

  • table renamings
  • column renamings
  • SQL type remappings
  • database connection initializations

The syntax for setting these properties is outlined in the table below.

Property Key

Property Value

Description

table.name

new_table_name

table is the recognized name of a Flux table.

new_table_name is the new name that you are assigning to the table.

The String ".name" is literal and must be included in the key.

Any table can be renamed. You may choose to rename tables if the standard table names are too long for your database, or if you need to adhere to a table naming standard.

As an example, the key "FLOW_CHART.NAME" with the value "MY_FLOW_CHART" will rename the FLOW_CHART table to MY_FLOW_CHART - so any time Flux would normally use the FLOW_CHART table, it will now use MY_FLOW_CHART instead.

When renaming tables, do not include the prefix (like "FLUX_") in either the key or the value. The table prefix is automatically applied internally.

table.column.name

new_column_name

table is the recognized name of a Flux table. column is the recognized name of a Flux column.

new_table_name is the new name that you are assigning to the column.

Note that table is the original Flux table name - if you have renamed the table in the database properties, you will still need to use the original table name here.

The String ".name" is literal and must be included in the key.

As an example, the key "FLOW_CHART.PK.NAME" with the value "MY_PK" will rename the PK column in the FLOW_CHART table to MY_PK.

sql_type.type

new_sql_type

sql_type is a recognized SQL type (such as BIGINT).

new_sql_type is the recognized SQL type that should be used instead of sql_type (such as DECIMAL).

The String ".type" is literal and must be included in the key.

connection_init

sql

sql is an SQL statement that should be executed on a database connection when it is first created.

Does not apply to connections obtained from a data source.

The entire string "connection_init" is literal and must be included in the key.

connection_init.N

sql

sql is an SQL statement that is executed on a database connection when it is first created.

Does not apply to connections obtained from a data source.

N is a number, greater than one. Database connection initialization statements are executed in numerical order (so connection_init.2 is executed before connection_init.3, etc.)

The string "connection_init." is literal and must be included in the key.

Initializing Database Connections

As described in #Database Properties Summary above, you can execute custom SQL initialization statements on JDBC connections as they are created.

To initialize a JDBC connection with custom SQL, set the following property in your database properties file:

CONNECTION_INIT=<initializing SQL statement>

If more than one initialization SQL statement is needed, you can create additional properties:

CONNECTION_INIT.2=this SQL initialization statement runs second
CONNECTION_INIT.3=this SQL initialization statement runs third

You can set as many initialization statements as you require.

Database Indexes

The SQL scripts included with Flux already contain the recommend database indexes. If you have created your database tables as recommended, there should be no need to create additional indexes.

Notes

DB2

Appropriate database indexes can not only increase database performance, but especially in the DB2 database, they can also reduce the chances of deadlock.

In addition to database indexes, the database script for DB2 includes statements to flag certain tables as "VOLATILE", which can improve performance and reduce the opportunity for deadlock on DB2 systems.

Database Failures

Sometimes databases fail. Sometimes they crash and are later restarted. Sometimes the network connection between Flux and the database goes down or is up only intermittently.

In any of these cases, Flux may be operating while the database is not available. When this happens, Flux will not be able to persist any data to the database, so it must wait until the database becomes available. When the database is available, Flux will continue operating normally.

Flux can recover from database failures very soon after the database recovers. The maximum amount of time it will take Flux to recover is governed by the SYSTEM_DELAY engine configuration property. If Flux detects that the database is not available, it will wait for the period of time specified by the system delay, then try to use the database again. If the database is available, Flux will then run normally - otherwise, it will repeat this process until the database is available.

The SYSTEM_DELAY property is specified as a time expression. For example, to set a system delay of 30 seconds, you would use:

SYSTEM_DELAY=+30s

The default system delay is +3m, or three minutes.

Application Startup

In Flux, it is not possible to instantiate an engine before your database is available. Flux always assumes that all necessary database tables have already been created. In production settings, it is typical that all database tables have been created ahead of time.

  • No labels