Written by

Developer - Internal Applications at InterSystems
Question Pravin Barton · Jan 31, 2018

How to assign unique IDs to rows in the product of a table join

This might be more of a math problem than a Caché question.
I have a SQL query that joins two tables. I want to assign a unique ID to each row of the product table.

  1. I could append the GUIDs of the rows in the two tables, but there are a number of clients that expect a maximum length of 50 on this unique ID. Two GUIDs appended make 72 characters.
  2. I could append the two GUIDs and then truncate the result, but now I'm worried about collision.

What's the chance of collision if I append the GUIDs and truncate the result to 50 characters? Is there a good way to solve this without updating the clients?

Comments

Jean Millette · Jan 31, 2018

How about adding an SQL method to one of the classes being joined? The method would call %SYSTEM.Util's "CreateGUID" method and the field associated with that method would be included in the join.

0
Pravin Barton  Jan 31, 2018 to Jean Millette

Good thought, but I should have mentioned that the GUID for a specific row needs to stay the same over multiple calls of the query.

0
Pravin Barton  Jan 31, 2018 to Eduard Lebedyuk

We have a table for holidays and for people. Both of these tables have a country column. Each country has a list of holidays and all the people in that country have all those holidays off. The result of the join means semantically: which people have which days off. I could create a third table for this but it would have to be updated any time a holiday or a person gets added.

Your suggestion of using a hash function is good, and I think I'll do that.

0
Jean Millette  Jan 31, 2018 to Pravin Barton

Ah, ok thanks for the feedback Pravin. Does the new GUID need to be different than the GUIDs for the records being joined? If the new GUID is not to be persisted, could you pick the GUID from one of the records being joined?

0
Pravin Barton  Jan 31, 2018 to Jean Millette

Currently we're using the GUID from one of the tables, but the problem is it's not unique anymore after the join.

0
Pravin Barton  Jan 31, 2018 to Eduard Lebedyuk

We're uniting this person+holiday table with a different table of personal time off to create a general absences table. Client applications access this table through a web service in order to sync a schedule. They need a GUID on each absence entry so they know what needs to be updated. For example, if a holiday changes there's an absence for each person in that country, and the client needs to update each of those entries.

We're only sending across the absences that have been updated since the last sync, so the client can't just rebuild the whole schedule every time.

0
Pravin Barton  Jan 31, 2018 to Robert Cemper

Thanks, that might work. Is there any danger of the ID from CachéStorage getting reused if an entry gets deleted?

0
Robert Cemper  Jan 31, 2018 to Pravin Barton

NO NEVER!
ID in CacheStorage is an "autoincrement" positive Integer (1...2**64-1)    ~19 digits decimal.

AND is never reused.
And you can split this construct in the original pieces if required

0
Eduard Lebedyuk  Jan 31, 2018 to Pravin Barton

Okay, what does GUID means in your case? I get how people have GUIDs and holidays have GUIDs, but the purpose for the GUID of a person+holiday combo escapes me.

0
Robert Cemper  Jan 31, 2018 to Pravin Barton

if this is something like

SELECT people.name, holiday.date, people.ID||'^'||holiday.ID as UNIQ
FROM people JOIN holiday
ON people.country = holiday.country

Then the bolded expression should not be longer than your 50 Char.
IF you use CachéStrorage

0
Eduard Lebedyuk · Jan 31, 2018

Can you elaborate on your data model? What are your two tables, and what information joining them  generates.

Consider the following database: it has clients and products -and each client and each product has a guid.

The join between clients and products would mean semantically - what client bought which products.

But it's probably be better to store this information in another table - orders and just add properties/fk/relationships to clients and products.

You want GUIDs - a mark of persistency,  but you want them in a transient query. I think it would be better to create another table and populate it with the relevant data and new  GUIDs and return that new GUIDs.

Another approach would be exposing hash function as an sql procedure and passing both GUIDs into it and returning a hash to a client.

0
Robert Cemper · Jan 31, 2018

rewrite of comment:

if this is something like

SELECT people.name, holiday.date, people.ID||'^'||holiday.ID as UNIQ
FROM people JOIN holiday
ON people.country = holiday.country

Then the bolded expression should not be longer than your 50 Char.

IF you use CachéStorage
ID is always a unique positive Integer (1.... 19digits)  and you can always disassemble it by the separator.

0