Reverse Engineering a Data ModelUsing the Oracle Data Dictionary
by Eve Andersson
This article shows you how to use the Oracle Data Dictionary to obtain:
The ProblemLet's say you have to work with an Oracle database, using a data model that somebody else wrote. Maybe you're extending the data model or building an application that references it. There's only one problem: whomever created the original data model left without writing a line of documentation.
What do you do? How to you reverse engineer the data model to unearth the table definitions, constraints, indexes, views, sequences, triggers, and PL/SQL functions and procedures?
This ends up being an easy task if you use the Oracle data dictionary.
The Oracle Data DictionaryJust like you use Oracle tables to store your data, Oracle uses tables to store its data. A set of tables, called the Oracle data dictionary, contains information about all the structures (tables, views, etc.) and procedural code (triggers, PL/SQL procedures, etc.) created by each user.
For example, there's a table called USER_TAB_COLUMNS that contains information about all the columns you've defined, including: what table the column belongs to, the data type (number, varchar, etc.), what the default value is, whether the column can be null, etc.
The Oracle data dictionary is huge and contains a lot of esoteric stuff, but when you whittle it down to only the info you need, it's not so menacing. Here are the data dictionary tables I find useful. You can do SELECTs on them, just as you would any other table in Oracle:
All of the above tables (the USER_* tables) only contain objects defined by the current Oracle user. Oracle also maintains a set of tables of identical structure that start with ALL_*. These show you every object that you have access to, regardless of whether you created that object (e.g., our beloved friend DUAL). Similarly, Oracle provides DBA_* tables that contain info about all users' objects, but this group of tables is only accessible to the database administrator.
Find out what tables have been defined in your system:
select TABLE_NAME from USER_TABLES
TABLE_NAME is really the only important info we can get from Oracle's data dictionary table USER_TABLES. When tables are created, most of the action takes place in the definition of individual columns, which we'll look at later.
For example, if you have four tables defined in your system, your query will return four rows:
TABLE_NAME EMPLOYEES OFFICES SOFTBALL_TEAMS EMPLOYEES_AUDIT
For each table, get any comments written by the data model author:
Note that the TABLE_NAME must be written in all uppercase letters.select COMMENTS from USER_TAB_COMMENTS where TABLE_NAME = 'TABLE_NAME' and COMMENTS is not null
Example: if we do this query for the EMPLOYEES table, we find the following comment:
|This is a table to hold all current, past, and future employees. Application developers might find the views EMPLOYEES_CURRENT, EMPLOYEES_PAST and EMPLOYEES_FUTURE useful.|
In my experience, very few developers document their tables within Oracle (if the tables are documented, the documentation is generally done in some file somewhere else). But if you want to be a conscientious developer and ensure that your comments show up in the data dictionary for future programmers to find, you can use the command:
comment on table TABLE_NAME is 'This is my comment.'
If you only want basic info about each column (name, type, and whether it's nullable), the easiest way to get it is to DESCRIBE the table (or DESC, for short). Let's see what columns the EMPLOYEES table contains:
SQL> desc employees; Name Null? Type ----------------------------------------- -------- ---------------------------- EMPLOYEE_ID NOT NULL NUMBER(38) LAST_NAME NOT NULL VARCHAR2(200) FIRST_NAME NOT NULL VARCHAR2(200) EMAIL VARCHAR2(100) PRIMARY_OFFICE_ID NOT NULL NUMBER(38) START_DATE NOT NULL DATE END_DATE DATE SALARY NUMBER(9,2) YEARS_EXPERIENCE NUMBER MANAGEMENT_TRACK_P CHAR(1) SHORT_BIO VARCHAR2(4000) LIFE_STORY CLOB PHOTO BLOB
But if you want more detailed -- and parseable -- information about your tables, you will have to query from the data dictionary. Here's how we get the column info (note: this does not include the comments, constraints, and indexes, which are stored elsewhere in the data dictionary):
select COLUMN_NAME, DATA_TYPE, DATA_LENGTH, DATA_PRECISION, DATA_SCALE, NULLABLE, DATA_DEFAULT from USER_TAB_COLUMNS where TABLE_NAME = 'TABLE_NAME'
For example, if we do the above query for the EMPLOYEES table, we get back the following results:
Useful facts for deciphering the above:
Based on this, we can derive the following table definition:
create table eve_employees ( employee_id integer not null, last_name varchar(200) not null, first_name varchar(200) not null, email varchar(100), primary_office_id integer not null, start_date date default sysdate not null, end_date date, salary number(9,2), years_experience number, management_track_p char(1) default 'f', short_bio varchar(4000), life_story clob, photo blob );
Note that we still haven't looked up any constraints, indexes, or column comments.
select COLUMN_NAME, COMMENTS from USER_COL_COMMENTS where TABLE_NAME = 'TABLE_NAME'
The EMPLOYEES table has two columns with comments:
|PRIMARY_OFFICE_ID||The office that the employee spends most of their time in.|
|MANAGEMENT_TRACK_P||Has the employee expressed a desire and aptitude for management training?|
Note that if you want to put comments into the data dictionary for future programmers to find, you can use the following syntax:
comment on column TABLE_NAME.COLUMN_NAME is 'This is my comment.'
For the EMPLOYEES table, we get:select UCC.CONSTRAINT_NAME, UCC.COLUMN_NAME, UC.CONSTRAINT_TYPE, UC.SEARCH_CONDITION, UC2.TABLE_NAME as REFERENCES_TABLE from USER_CONS_COLUMNS UCC, USER_CONSTRAINTS UC, USER_CONSTRAINTS UC2 where UCC.CONSTRAINT_NAME = UC.CONSTRAINT_NAME and UC.R_CONSTRAINT_NAME = UC2.CONSTRAINT_NAME(+) and UCC.TABLE_NAME = 'TABLE_NAME' order by UCC.CONSTRAINT_NAME
|SYS_C0057015||LAST_NAME||C||"LAST_NAME" IS NOT NULL|
|SYS_C0057016||FIRST_NAME||C||"FIRST_NAME" IS NOT NULL|
|SYS_C0057017||PRIMARY_OFFICE_ID||C||"PRIMARY_OFFICE_ID" IS NOT NULL|
|SYS_C0057018||START_DATE||C||"START_DATE" IS NOT NULL|
|SYS_C0057019||MANAGEMENT_TRACK_P||C||management_track_p in ('t','f')|
|SYS_C0057020||SHORT_BIO||C||short_bio is not null or life_story is not null|
|SYS_C0057020||LIFE_STORY||C||short_bio is not null or life_story is not null|
There are four types of constraint:
Note that the constraint SYS_C0057020 appears twice above; this is because it is a multi-column constraint. Note also that the "not null" constraints appear here even though they also appear in USER_TAB_COLUMNS (a little redundancy).
Based on the information we have so far, we can document the table as follows:
-- This is a table to hold all current, past, and future employees. Application -- developers might find the views EMPLOYEES_CURRENT, EMPLOYEES_PAST and -- EMPLOYEES_FUTURE useful. create table employees ( employee_id integer primary key, last_name varchar(200) not null, first_name varchar(200) not null, email varchar(100) unique, -- The office that the employee spends most of their time in. primary_office_id not null references offices, start_date date default sysdate not null, end_date date, salary number(9,2), years_experience number, -- Has the employee expressed a desire and aptitude for management training? management_track_p char(1) default 'f' check(management_track_p in ('t','f')), short_bio varchar(4000), life_story clob, photo blob, check(short_bio is not null or life_story is not null) );
SELECT INDEX_NAME, COLUMN_NAME FROM USER_IND_COLUMNS WHERE TABLE_NAME='TABLE_NAME' ORDER BY INDEX_NAME
The indexes on EMPLOYEES:
EMPLOYEE_DATES_IDX appears twice because it is a multi-column index. Oracle automatically created the index on EMPLOYEE_ID because it is a primary key. Oracle automatically created the index on EMAIL because that column has a unique constraint.
From this, we can see that the original index definitions were:
create index employee_dates_idx on employees(start_date, end_date); create index employee_ye_idx on employees(years_experience);
select UV.VIEW_NAME, UV.TEXT, UTC.COMMENTS from USER_VIEWS UV, USER_TAB_COMMENTS UTC where UV.VIEW_NAME = UTC.TABLE_NAME(+)
In our example data model, we have the following views defined:
where start_date <= sysdate
and end_date >= sysdate
|All employees who've already started working here and who have not yet ended their employment.|
where start_date > sysdate
where end_date < sysdate
where state_or_province in ('CA','WA','OR','HI','AZ')
Based on this, we know that the view OFFICES_REGION_I was created with the following statement:
create or replace view OFFICES_REGION_I as select "OFFICE_ID","OFFICE_NAME","STATE_OR_PROVINCE" from offices where state_or_province in ('CA','WA','OR','HI','AZ')
select SEQUENCE_NAME, MIN_VALUE, MAX_VALUE, INCREMENT_BY, CYCLE_FLAG, ORDER_FLAG, CACHE_SIZE from USER_SEQUENCES
Here are the sequences in our example system:
Let's decipher these values. All of the values in the EMPLOYEE_SEQ row above are Oracle's default values, so we know it was created with the simple statement "create sequence employee_seq".
The other two sequences had optional arguments specified. We can deduce that the original sequence definitions were:
create sequence employee_seq; create sequence misc_seq increment by 2 start with 314 maxvalue 10000000000000 cycle cache 10; create sequence misc2_seq increment by -1;
As an aside, notice that the max value for EMPLOYEE_SEQ is 1.0000E+27 (or 1,000,000,000,000,000,000,000,000,000). Sometimes novice Oracle programmers feel uncomfortable using sequences to generate primary keys because they fear the sequences might "run out" of values. But even if each of the six billion people in the world orders a quadrillion items from your online store, there will still be plenty of sequence values left for their future purchases.
select TRIGGER_NAME, TRIGGER_TYPE, TRIGGERING_EVENT, TABLE_NAME, WHEN_CLAUSE, DESCRIPTION, TRIGGER_BODY from USER_TRIGGERS
In the example system, we have three triggers defined:
softball_teams_tr after insert on offices for each row
begin insert into softball_teams ( team_id, team_name ) values ( misc_seq.nextval, :new.office_name ); end;
softball_teams_update_tr after update on offices for each row
begin update softball_teams set team_name = :new.office_name where team_name = :old.office_name; end;
employees_audit_tr before update or delete on employees for each row
begin insert into employees_audit ( employee_id, last_name, first_name, email, primary_office_id, start_date, end_date, salary, years_experience, management_track_p, short_bio, life_story, photo ) values ( :old.employee_id, :old.last_name, :old.first_name, :old.email, :old.primary_office_id, :old.start_date, :old.end_date, :old.salary, :old.years_experience, :old.management_track_p, :old.short_bio, :old.life_story, :old.photo ); end;
From this, it's easy to put together the original trigger definitions, for example:
create or replace trigger softball_teams_tr after insert on offices for each row begin insert into softball_teams ( team_id, team_name ) values ( misc_seq.nextval, :new.office_name ); end; / show errors;
create or replace trigger softball_teams_update_tr after update on offices for each row when (old.office_name != new.office_name) begin update softball_teams set team_name = :new.office_name where team_name = :old.office_name; end; / show errors;
The general form is:
create or replace trigger TRIGGER_DESCRIPTION when (WHEN_CLAUSE) [leave out this line if the WHEN_CLAUSE is null] TRIGGER_BODY / show errors;
select NAME, TYPE, LINE, TEXT from USER_SOURCE order by NAME, TYPE, LINE
Our example results show that we have four PL/SQL objects defined:
|HUMAN_RESOURCES||PACKAGE||4||function add_office (|
|HUMAN_RESOURCES||PACKAGE||5||v_office_name IN varchar|
|HUMAN_RESOURCES||PACKAGE||6||) return number;|
|HUMAN_RESOURCES||PACKAGE BODY||1||package body human_resources|
|HUMAN_RESOURCES||PACKAGE BODY||4||function add_office (|
|HUMAN_RESOURCES||PACKAGE BODY||5||v_office_name IN varchar|
|HUMAN_RESOURCES||PACKAGE BODY||6||) return number is|
|HUMAN_RESOURCES||PACKAGE BODY||7||v_office_id number;|
|HUMAN_RESOURCES||PACKAGE BODY||9||select misc_seq.nextval into v_office_id from dual;|
|HUMAN_RESOURCES||PACKAGE BODY||11||insert into offices|
|HUMAN_RESOURCES||PACKAGE BODY||12||(office_id, office_name)|
|HUMAN_RESOURCES||PACKAGE BODY||14||(v_office_id, v_office_name);|
|HUMAN_RESOURCES||PACKAGE BODY||16||return v_office_id;|
|HUMAN_RESOURCES||PACKAGE BODY||17||end add_office;|
|HUMAN_RESOURCES||PACKAGE BODY||20||end human_resources;|
|SOFTBALL_TEAM_DELETE||PROCEDURE||1||procedure softball_team_delete (|
|SOFTBALL_TEAM_DELETE||PROCEDURE||2||v_team_id IN number|
|SOFTBALL_TEAM_DELETE||PROCEDURE||6||delete from softball_teams|
|SOFTBALL_TEAM_DELETE||PROCEDURE||7||where team_id = v_team_id;|
To query the source for just one PL/SQL object, do the following:
If we do this for our procedure SOFTBALL_TEAM_DELETE, we getselect TEXT from USER_SOURCE where name='OBJECT_NAME' and type='OBJECT_TYPE' order by LINE
Based on this, we know that the original procedure definition would have been:procedure softball_team_delete ( v_team_id IN number ) is begin delete from softball_teams where team_id = v_team_id; end softball_team_delete;
create or replace procedure softball_team_delete ( v_team_id IN number ) is begin delete from softball_teams where team_id = v_team_id; end softball_team_delete; / show errors;
Here is an example script. If you are running the ArsDigita Community System 3.x, you can run this script and other useful functions by downloading the small Schema and Data Browser module that I wrote. If you're using a different architecture, you can use this script as an example to write your own. Please let me know if you write such a script; I'll add a link to it from here.
Wonderful Article!I found your article by searching for bits and pieces of information that I knew and refining the search as I learned more. I finally wound up using google and searching for "user_tables, user_tab_columns, user_constraints, user_con_columns".
After reading your article I am somewhat satisfied that I'll be able to document this system that I've recently inherited. Every time I try to enter data, I seem to always get some sort of violation in one form or another.
One thing that I've found out is that the SQL used to create these constraints are stored in a column in LONG format and the environment defaults don't always let you see the complete statement. I found entering 'set LONG 9999' on the command line allowed me to read the entire command.
-- Howard Dansby
Real hands-on hintYou didatically conduct the reader inside the Oracle Misteries. :-) With its information, I wrote a Shell Script for take a snapshot of an entire schema. Congratulations, not only for this article, but your entire homepage. That's a really good stuff. Go on, you are in the rigth way!
-- Marcio Siqueira
Superior Article Written for All to UnderstandI have been battling with getting comments in the data dictionary for some time now, and through your simplification, it was a matter of moments and my Table and Column comments have been saved. Now I have visions of many new views that will assist the users of my application,
I wish I had found your site earlier. This is a site all newcomers to Oracle should have bookmarked. There is a tremendous amount of useful information written to a broad audience of readers.
Many Thanks to you Eve! You are a marvelous asset to the Oracle user community.
-- Richard Sidebottom
Additional tipVery useful information - and all in one place! Thanks.
I often use these techniques in combination with spooling to generate actual scripts.
By inserting literal SQL code and concatenating with output from queries, very powerful scripts can be generated automatically.
set head off set pages 0 set trims on set lines 2000 set feed off set echo off set serveroutput on spool sequences.sql select 'CREATE SEQUENCE ' || SEQUENCE_NAME || ' INCREMENT BY 1 START WITH ' || to_char(LAST_NUMBER+1) || ' MINVALUE 1 NOCYCLE CACHE 20 NOORDER;' from user_sequences; spool offwill generate the file sequences.sql that contains ready to run statements to re-create the sequences.
-- Ola Strandberg
informative articleI must appreciate author of this article. It is really very useful. I know about all this catalog table but now I know how best I can use them.
-- Vikrant Srivastava
Great HelpVery useful info.
Thanx a lot,
Allan Reyes BSc Computing Science
-- Allan Reyes
Great articleIt helped a lot in my integration work...
-- malyadri beegala
Perfect!I have worked with Oracle for 16 years. Your reverse engineering ideas were elegant. My latest project: shadetunes Thanks!
-- Don Shade
Very useful informationHave eeen seaching the oracle dictionary for a long time. Now got it from your simplified article.
-- zhang qg
ExcellentI am a project trainee and doing a project on developing a data base browser, your tutorial provided a great help to me for reverse engineering the data model and getting more details out of the oracle data dictionary
-- kaushal pandya
Tanks for this helpful article
Take this as an additional hint:select t1.owner, t1.table_name, t1.constraint_name, t1.column_name, t1.position, t3.table_name parent_table, t3.column_name parent_column from user_cons_columns t1, user_constraints t2, user_cons_columns t3, user_constraints t4 where t1.table_name = t2.table_name and t1.constraint_name = t2.constraint_name and t1.owner = t2.owner and t2.CONSTRAINT_TYPE = 'R' and t4.constraint_type = 'P' and t4.constraint_name = t2.r_constraint_name and t4.OWNER = t2.R_OWNER and t4.table_name = t3.table_name and t4.CONSTRAINT_NAME = t3.CONSTRAINT_NAME and t4.owner = t3.owner and t1.position = t3.position order by t1.table_name, t1.constraint_name, t1.position
This query gives the columns involved in a referential integrity constraint in both tables. Is a bit time consuming but helps in case the column_names in both parent and children differ.
Hope this helps
-- Carlos Brice?o
Great Work!Very useful information organized in a very userful manner. So you do...value integrity!
-- Umair Sani
data modelGreat work
-- dileepkumar kurra
Great Article!Data Dictionary - * Well illustrated * Simple * Well structured/organised
I appreciate it!
-- Harihara Puthiran Sundaramurthy
great resourceH I am Shashi , a computer science grad. I was working on a project and using Oracle 9i as the backend, I was facing a little difficulty and while googling for help I came across eveandersson site ...I found more than what i wanted. I really appreciate the site and its content. Everything is just wonderfully explained and I am completely impressed.
Thanks a lot Eve !
Shashi. Hyderabad. India.
-- Shashi Harige
This was maginifcent! I searched the Oracle documentation regarding this information, and could not find many of the bits I needed. You have provided information vital to the completion of an internal project. Thank you!
-- Mark Evans
Useful Oracle -- free of charge!Eve,
Thanks for your very useful article. You are a unique and refreshing resource; I haven't come across many folks on the 'net who are open-minded enough to give away this kind of information. Experienced Oracle users who already know this stuff are either trying to sell a book, sell a paysite, or sell themselves. Your presentation is straightforward and unpretentious -- and, best of all, accessible to someone who knows databases other than Oracle. I have been moving an intranet web tool I created from a Postgres base to an Oracle base for my client, and your article has been very helpful, indeed.
I also browsed a few links on the rest of your site, and you are a formidable person. I was amused to find that you are from Seattle, but are currently at Neumont University in Salt Lake. I am from Salt Lake, but came to the Seattle area about the same time you left it. I must say, however, my journey was a dull straight line compared to yours. You appear to have circled the globe to get from here to Salt Lake! You have seen (and probably done) things I won't get to see and do if I lived another 100 years. So, thanks for the great photograhy that gives us a vicarious peek, and, especially, for the spot-on technical stuff. Publish more, please!
-- Cory Burt
Big ThanksCurrently i have been assign to a new project and definitely a new application that does not have any documentations. Big thanks to you...the reverse engineering info had helps me to speed-up my study of the new application.
-- Arbaniah Mohamed
Well written.Found this to be very useful.
-- Jayasankar Peethambaran
Great referenceBut it misses one piece, when you create function-indexes, like "create index xxx on some_table(substr(some_column, 1, 3))" then the user_ind_cols contains something like "SYS_NC00012$" in column_name. The original function can be found in the SYS.COL$ view.
-- Michael Bellefroid
Valuable sourcesCurrently i developing a system for my software engineering project. The system which is i developing is generating documentation from PHP system using reverse engineering tecniques. So, i still confusing about this system. I dont know how to trace classes, functions to generate UML diagrams. System that i doing is building on java prograaming language. This artical one of the sources that really helps my project. Thanks
-- saravanan vj
Your site is very useful.
-- Jim Dobson
Excellent ArticleI had some doubts regarding the Reverse Engineering but this article helped me immensely. Great Work!!!
-- Hemant Jain
Nice Paper in helping implementation to be done with built-in intelligenceEve,
Thanks for providing this reverse engineering a Data model which I used in one of the tools developed by me and able to put in a lot of built-in intelligence for the tool. Thats a tool for populating the data from Excel workbook to Oracle databse. This helped me to validate various Integrity constraints with out having to depend upon JDBC Classes in Java.
Thanks a lot for this and look forward for this kind of papers in future
Regards, Hareesh Ram Chanchali
-- Hareesh Ram Chanchali
Good oneNicely done
-- Ravi Narayanaswamy
Thanks!Been strugglnig with the concept of an Oracle database that allows me to code up views and procedures, but not see see what I just wrote... your article gave me what Oracle left out of its documentation! Thanks, it just made my life a lot easier.
-- Julian P
Great work!Thank you! Your tutorial was very usefull to me!
-- Abrantes Filho
Index Reverse EngineeringThanks. It helped me a lot. I have added Index reverse engineering details.
SELECT ind.table_name TableName,
ind.index_name Name, '' TYPE, indcol.column_position SeqNo,
DECODE(ind.UNIQUENESS, 'UNIQUE', 1, 'NONUNIQUE', 0) IsUnique,
DECODE(cons.CONSTRAINT_TYPE, 'P', 1, 0) IsPrimary,
DECODE(indcol.DESCEND, 'ASC', 'Ascending', 'DESC', 'Descending') SortType,
FROM USER_IND_COLUMNS indcol,
USER_TAB_COLS tabcol, USER_INDEXES ind
left outer join USER_CONSTRAINTS cons ON ind.INDEX_NAME = cons.CONSTRAINT_NAME AND
cons.CONSTRAINT_TYPE = 'P'
WHERE ind.INDEX_NAME = indcol.INDEX_NAME AND
indcol.TABLE_NAME = tabcol.TABLE_NAME AND indcol.COLUMN_NAME = tabcol.COLUMN_NAME
ORDER BY ind.Table_Name, ind.INDEX_NAME, indcol.column_position
-- Mani Muthusamy
I am new user of oracle Data Base,after reading this topic its clear my all confusion.Its very very usefull....
Attachment: New Microsoft Word Document.doc
-- Shailesh Vikharankar
Even simpler method...I don't know how old the article above is, but since Oracle 9i, there has existed a rather straight-forward way of "reverse engineering" DDL:
SELECT DBMS_METADATA.GET_DDL('TABLE','SOFTBALL_TEAMS') FROM DUAL;
-- Markku Uttula
Goodthe article is very usefull. If possible mail me about the Data Designer (Modeler) for a data base which is already existing ( Reverse Engineering ).
Thanks Aiyaz email@example.com
-- Mohammed Abdul Aiyaz
This really help me a lot. I want to know one more thing I would like to capture all DDL chnages in all objects made by users by yesterday.Then after I need to apply these changes to all the users.
-- Neha Verma