Amend Firebird 'unique ID' by date programmatically rather than manually
August 9, 2022 1:34 AM   Subscribe

I have a Firebird DB that I sometimes need to amend dates/times on. Basically it's a timeclock system that I've asked about twice before (1) and (2). The system works brilliantly, with the caveat that if someone forgets to clock out, it can be a real pain to insert a row to tell the system they've clocked out. The unique IDs must be sequential and incrementing in order for the system not to throw a wobbly*

If a lot of events have elapsed between the non-clock out being noticed, what I have to do is:-

Using SQL Workbench,
Insert a new row for the clock out event
Give all the records a temporary unique ID that won't conflict with the about to be amended record
Give the new row a sequential unique ID
Then give the remaining original rows (which currently have a temporary ID) with a sequential unique ID

It's a manual and quite labour intensive process.

I wondered if there was any way to automate it?

Many thanks!!


* Would not be opposed to amending the node logic so it doesn't require the unique IDs to be incremental in order for it to not fall over instead....might be a lot easier?!
posted by dance to Computers & Internet (13 answers total)
 
Why can't you add the new row with the next sequential id? Are the ids linked to the clock out times directly? Normally I'd expect you'd just add the new row, but the clock out event time would be in the past to reflect the actual events.
posted by crocomancer at 3:02 AM on August 9, 2022


You need to fix your application logic so that it can handle non-sequential keys. Anything else is just going to be layering bandaids on bandaids.
posted by rockindata at 4:13 AM on August 9, 2022 [1 favorite]


Another option would be to automate the fix process that currently takes a bunch of manual effort. All the things you need to do could be bundled up into a script fairly easily.

By the way, the reason things are breaking is probably that the app does something like sorting by primary key instead of sorting by event time, which means that if last nights sign out has a key that is after todays sign in, throngs get weird.
posted by rockindata at 5:19 AM on August 9, 2022


You said that the unique IDs need to be sequential, but not that the sequence needs to be gapless. So, one bad workaround that’s possibly better than your current bad workaround would be to allocate only every (say) 10th sequence value when your application inserts rows in the normal way. Then you have nine intervening values rouse for later manual inserts.

But I agree with rockindata that the real problem here is that the application uses unique IDs for the wrong reason.
posted by rd45 at 5:36 AM on August 9, 2022 [1 favorite]


Exactly what rockindata and rd45 said. The only legitimate purposes of a unique ID are (a) to be unique and (b) to join tables on.

If you need to order stuff by time then order it by time, not by some ID that usually happens to be ordered the same way time is.

Auto-incrementing integer IDs get used because they cost less to store than random GUIDs, but if your application logic actually cares which of those you're using then your application logic is looking at your DB wrong.
posted by flabdablet at 5:51 AM on August 9, 2022 [3 favorites]


A principle well worth wrapping one's head around as a budding database designer is that a database table is not a spreadsheet. A database table represents a set of records, not a list of records, so the order that rows are stored inside it must not matter for data modelling purposes. It might matter for performance optimization, but performance optimizations must never change the meaning of the underlying data.

If your application needs to process the rows of any query result in some specific order (by event time, for example) then the query needs to include a specific ORDER BY clause to make that happen.

It is never safe to assume that the rows of a database query result where the query included no ORDER BY clause are in fact ordered in a way that makes any kind of sense. And doing ORDER BY on a surrogate key is just broken; don't do that.
posted by flabdablet at 7:13 AM on August 9, 2022 [1 favorite]


If you can you should redesign it where each row has a clock in and clock out time. Each row should represent a work timeperiod.
posted by The_Vegetables at 7:31 AM on August 9, 2022


There would certainly need to be a table for work timeperiods, but it would be completely reasonable for that to contain pairs of timeclock event IDs rather than pairs of specific times, especially given that this system allows clocking in and out to be done from multiple locations.

Again, those timeclock event IDs should be meaningless except insofar as they unambiguously identify rows in a timeclock events table, which could usefully also record details about how and where and perhaps even when the associated event was added to the DB and who added it.

It seems to me like a good idea to allow the administrative correction of late or missing clock-in and clock-out events to be explicitly modelled in the database and achievable via the application, including the possibility of having a proper audit trail for such after-the-fact events, rather than requiring by-hand database fuckery every time one happens.
posted by flabdablet at 7:52 AM on August 9, 2022 [1 favorite]


Yeah by-hand database fuckery is never the right answer.

In a good database design, each entity should have its own table. I could certainly imagine a data model that recognises shift and clock_event as two different entities. Each entity has a bunch of attributes, and each attribute should map to a column in the relevant table. So, the shift table might have columns for starting_clock_event_id and ending_clock_event_id - each of which has a foreign key relationship with the clock_events table - which means that both those columns are populated with unique IDs from the clock_events table. To figure out the actual start or end time of a shift, you’d join those tables with a condition like “shifts.starting_clock_event_id = clock_events.event_id”.

To handle the audit requirement for later ad hoc data fixes, you can add columns like last_updated_by, last_updated_time and last_updated_reason to the clock_events table - then populate them from your application.
posted by rd45 at 8:12 AM on August 9, 2022 [1 favorite]


Again, those timeclock event IDs should be meaningless except insofar as they unambiguously identify rows in a timeclock events table, which could usefully also record details about how and where and perhaps even when the associated event was added to the DB and who added it.

IDK, it seems like if each event is independent, then you are adding use cases like 'X' person is able to clock in multiple concurrent times, or you have to add endless amounts of backend coding to join and close entries, which seems far beyond the scope of a basic time entry system. The multiple locations can be dealt with via UTC or some other basic time accounting.
posted by The_Vegetables at 8:28 AM on August 9, 2022


A clock event would also want an event_type attribute. Initially that could be as simple as a single bit to distinguish clock-ons from clock-offs, but at some point it might be extended to include stuff like, I dunno, building-caught-fire-at or whatnot.
posted by flabdablet at 8:29 AM on August 9, 2022


you are adding use cases like 'X' person is able to clock in multiple concurrent times

I've always taken the view that good database design needs to be able to model everything that the entities being modelled might reasonably be expected to do, so that when people find the truly creative ways to fuck things up that people always will, the database can just record the fuckups and let the app decide how to handle them. So, not so much adding use cases as eliminating unnecessary assumptions in order to achieve generality that facilitates robustness in the face of misuse cases.

But this is all rather tangential to the main point at hand here, which is the absolute undesirability of having the system design attach any importance whatsoever to the numeric ordering of surrogate keys.
posted by flabdablet at 8:37 AM on August 9, 2022


by-hand database fuckery is never the right answer

Apart from anything else it can get very expensive.
posted by flabdablet at 8:53 AM on August 9, 2022


« Older UK-EU dual national. Worth getting the EU passport...   |   Help me stay off my phone during work Newer »
This thread is closed to new comments.