Skip to main content

FORALLs and COMMITs

Received this note today from Debbie B:

I attended your Turbo Charge PL/SQL seminar today. I still have a question about where to put COMMIT. I’m using 10g. See (pseudo) code below.

Say I have 50,000 records and the LIMIT is 100. If an exception is thrown:
  • Do I need a COMMIT in the WHEN clauses so successful DML gets committed, then loop processing will continue?
  • I was thinking each DML needed it’s own BEGIN EXCEPTION END block so I would know if the error happened due to the insert or the update and could log the appropriate error. Is this wrong?

  OPEN v_cursor;
  LOOP
    FETCH v_cursor BULK COLLECT INTO data_array LIMIT i_limit;
    EXIT WHEN data_array.COUNT = 0;

    BEGIN
      FORALL i IN 1.. data_array.COUNT SAVE EXCEPTIONS
        INSERT INTO some_table VALUES data_array (i);
        COMMIT;
    EXCEPTION
        WHEN dml_errors THEN
             log_error;
             COMMIT;
             /* Don't RAISE, so execution will continue */
        WHEN OTHERS
             log_error;
             COMMIT;
             /* Don't RAISE, so execution will continue */
    END;

    BEGIN
      FORALL i IN 1.. data_array.COUNT SAVE EXCEPTIONS
        UPDATE some_table SET…
        COMMIT;
    EXCEPTION
        WHEN dml_errors THEN
             log_error;
             COMMIT;
             /* Don't RAISE, so execution will continue */
        WHEN OTHERS
             log_error;
             COMMIT;
             /* Don't RAISE, so execution will continue */
    END
  END LOOP;
  COMMIT;
  CLOSE v_cursor;

So many commits, so little time. 

Before addressing the specifics of this code, let's consider a more general question:

When and where should I put commits in my code?

Answer: How the heck should I know? Or, perhaps a little more politely: What is your transaction?

When you commit, you are really saying: I have finished making all changes needed for this transaction. Now it's time to save them, all together, all at once.

When we write back-end code that acts as an API to front end (user-driven) code, we usually don't have any commits. That's because we leave it up to the user to decide when they are done and ready to save their changes.

When we write code implementing backend processes not controlled by the user, then we do usually need to explicitly commit (and rollback). But again the primary question is very application-specific: what constitutes a transaction? 

Of course, the real world is messy - even when we clean it up, shear off rough edges, and make that real world fit into our cyber world.

For example, sometimes we have to do "incremental commits" - save changes to N rows at a time, to avoid rollback segment errors.

That may be the reason that Debbie has put COMMIT statements after each FORALL and after each logging of an error. 

But do you need a COMMIT in that exception section to ensure that successfully executed statements are not rolled back? NO! The failure of a SQL statement in your block will not force the rollback of previously completed statements.

As for putting each FORALL inside its own block, sure, that makes a lot of sense - if you want to make sure that both FORALLs always execute for each iteration of the loop. I would, however, put each inside its own nested subprogram (see my rewritten version of your code below).

I probably would not, however, use the same exceptional handling strategy for ORA-24381 (which your code implies was associated with the dml_errors exception) and for WHEN OTHERS. If something else went wrong in FORALL execution that was not caught by SAVE EXCEPTIONS, you should consider it "catastrophic" and stop the processing.

And, by the way, so far as I can tell, that final COMMIT before the CLOSE statement? Totally unnecessary - if you keep all those other COMMITs. You've already covered every path out of the loop!

OK, here's my offering of an alternative implementation. It's still kinda pseudo-codish because I lack the full details of your requirements, but:
  • Just one COMMIT at the end
  • Nested subprograms for each FORALL
  • No handling of unanticipated errors.

IS
   failure_in_forall   EXCEPTION;
   PRAGMA EXCEPTION_INIT (failure_in_forall, -24381);

   PROCEDURE bulk_inserts (data_array_in your_type)
   IS
   BEGIN
      FORALL i IN 1 .. data_array_in.COUNT SAVE EXCEPTIONS
         INSERT INTO some_table
              VALUES data_array_in (i);
   EXCEPTION
      WHEN failure_in_forall
      THEN
         log_error;
   END;

   PROCEDURE bulk_updates (data_array_in your_type)
   IS
   BEGIN
      FORALL i IN 1 .. data_array.COUNT SAVE EXCEPTIONS
         UPDATE some_table
            SET x = data_array (i);
   EXCEPTION
      WHEN failure_in_forall
      THEN
         log_error;
   END;
BEGIN
   OPEN v_cursor;

   LOOP
      FETCH v_cursor BULK COLLECT INTO data_array LIMIT i_limit;

      EXIT WHEN data_array.COUNT = 0;

      bulk_inserts (data_array);
      bulk_updates (data_array);
   END LOOP;

   CLOSE v_cursor;

   COMMIT;
END;



Comments

  1. Hello Steven, a little error inside of the procedure 'bulk_updates' - I think it should be "SET x = data_array (i);" and not "SET x = data_array (1);".
    Kind regards, Niels

    ReplyDelete
  2. Thanks, Niels. Surely one of the biggest mistakes every made by font designers was to create a font in which the number ONE and the letter EL and the letter EYE could look so much alike.

    ReplyDelete

Post a Comment

Popular posts from this blog

Quick Guide to User-Defined Types in Oracle PL/SQL

A Twitter follower recently asked for more information on user-defined types in the PL/SQL language, and I figured the best way to answer is to offer up this blog post. PL/SQL is a strongly-typed language . Before you can work with a variable or constant, it must be declared with a type (yes, PL/SQL also supports lots of implicit conversions from one type to another, but still, everything must be declared with a type). PL/SQL offers a wide array of pre-defined data types , both in the language natively (such as VARCHAR2, PLS_INTEGER, BOOLEAN, etc.) and in a variety of supplied packages (e.g., the NUMBER_TABLE collection type in the DBMS_SQL package). Data types in PL/SQL can be scalars, such as strings and numbers, or composite (consisting of one or more scalars), such as record types, collection types and object types. You can't really declare your own "user-defined" scalars, though you can define subtypes  from those scalars, which can be very helpful from the p

The differences between deterministic and result cache features

 EVERY once in a while, a developer gets in touch with a question like this: I am confused about the exact difference between deterministic and result_cache. Do they have different application use cases? I have used deterministic feature in many functions which retrieve data from some lookup tables. Is it essential to replace these 'deterministic' key words with 'result_cache'?  So I thought I'd write a post about the differences between these two features. But first, let's make sure we all understand what it means for a function to be  deterministic. From Wikipedia : In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states.  Another way of putting this is that a deterministic subprogram (procedure or function) has no side-effects. If you pass a certain set of arguments for the parameters, you will always get

How to Pick the Limit for BULK COLLECT

This question rolled into my In Box today: In the case of using the LIMIT clause of BULK COLLECT, how do we decide what value to use for the limit? First I give the quick answer, then I provide support for that answer Quick Answer Start with 100. That's the default (and only) setting for cursor FOR loop optimizations. It offers a sweet spot of improved performance over row-by-row and not-too-much PGA memory consumption. Test to see if that's fast enough (likely will be for many cases). If not, try higher values until you reach the performance level you need - and you are not consuming too much PGA memory.  Don't hard-code the limit value: make it a parameter to your subprogram or a constant in a package specification. Don't put anything in the collection you don't need. [from Giulio Dottorini] Remember: each session that runs this code will use that amount of memory. Background When you use BULK COLLECT, you retrieve more than row with each fetch,