238 rows updated
5730 rows updated
1428 rows updated
...
262 rows updated
1687 rows updated
PL/SQL procedure successfully completed.
Now, we could restart this process if it fails, since we would not process any object name that had already been processed successfully. The problem with this approach, however, is that unless we have some attribute that evenly partitions the data, we will end up with a very wide distribution of rows. The second UPDATE did more work than all of the others combined. Additionally, if other sessions are accessing this table and modifying the data, they might update the OBJECT_NAME field as well. Suppose that some other session updates the object named Z to be A after we have already processed the As; we would miss that record. Furthermore, this is a very inefficient process compared to UPDATE T SET LAST_DDL_TIME = LAST_DDL_TIME+1. We are probably using an index to read every row in the table, or we are full-scanning it n times, both of which are undesirable. There are so many bad things to be said about this approach.
The best approach is to do it simply. If it can be done in SQL, do it in SQL. What can't be done in SQL, do in PL/SQL. Do it using the least amount of code you can. Have sufficient resources allocated. Always think about what happens in the event of an error. So many times, I've seen people code update loops that worked great on the test data but then failed halfway through when applied to the real data. Then they are really stuck, as they have no idea where the loop stopped processing. It's a lot easier to size undo correctly than to write a restartable program. If you have truly large tables that need to be updated, you should be using partitions, each of which you can update individually. You can even use parallel DML to perform the update or, in Oracle 11g Release 2 and above, the DBMS_PARALLEL_EXECUTE package.
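To give a rough sense of what that looks like, the following sketch breaks the table T into rowid ranges and runs the simple UPDATE against each chunk in parallel. The task name, chunk size, and parallel level are arbitrary values of my own; only the table T and the UPDATE statement itself come from the example above.

begin
    dbms_parallel_execute.create_task( task_name => 'UPDATE_T' );

    -- carve T into rowid ranges of roughly 10,000 rows each
    dbms_parallel_execute.create_chunks_by_rowid
    ( task_name   => 'UPDATE_T',
      table_owner => user,
      table_name  => 'T',
      by_row      => true,
      chunk_size  => 10000 );

    -- run the simple UPDATE once per chunk, four chunks at a time;
    -- the package supplies the :start_id and :end_id rowid binds
    dbms_parallel_execute.run_task
    ( task_name      => 'UPDATE_T',
      sql_stmt       => 'update t set last_ddl_time = last_ddl_time+1 ' ||
                        'where rowid between :start_id and :end_id',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => 4 );

    dbms_parallel_execute.drop_task( task_name => 'UPDATE_T' );
end;
/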
Using Autocommit
My final words on bad transaction habits concern the one that arises from using the popular programming APIs
ODBC and JDBC. These APIs “autocommit” by default. Consider the following statements, which transfer $1,000 from
a checking account to a savings account:
update accounts set balance = balance - 1000 where account_id = 123;
update accounts set balance = balance + 1000 where account_id = 456;
If your program is using ODBC or JDBC when you submit these statements, the API (silently) injects a commit after each UPDATE. Consider the impact of this if the system fails after the first UPDATE and before the second. You've just lost $1,000!
I can sort of understand why ODBC does this. The developers of SQL Server designed ODBC, and this database
demands that you use very short transactions due to its concurrency model (writes block reads, reads block writes,
and locks are a scarce resource). What I can't understand is how this got carried over into JDBC, an API that is
supposed to support “the enterprise.” It is my belief that the very next line of code after opening a connection in JDBC
should always be this:
Connection conn = DriverManager.getConnection
    ("jdbc:oracle:oci:@database", "scott", "tiger");
conn.setAutoCommit(false);
This returns control over the transaction back to you, the developer, which is where it belongs. You can then
safely code your account transfer transaction and commit it after both statements have succeeded. Lack of knowledge
of your API can be deadly in this case. I've seen more than one developer unaware of this autocommit “feature” get
into big trouble with his application when an error occurred.
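As a minimal sketch of that transfer (reusing conn and the two UPDATE statements from above, and assuming the enclosing method declares throws SQLException; the error handling shown is just one reasonable choice), the transaction might be coded like this:

conn.setAutoCommit(false);
try ( Statement stmt = conn.createStatement() )
{
    stmt.executeUpdate
    ("update accounts set balance = balance - 1000 where account_id = 123");
    stmt.executeUpdate
    ("update accounts set balance = balance + 1000 where account_id = 456");
    conn.commit();       // both updates are made permanent together...
}
catch (SQLException e)
{
    conn.rollback();     // ...or neither is, if anything goes wrong
    throw e;
}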
 