This document covers in detail the various parts of the DPT database API, and shows how it can be used to access DPT files from your own programs. This could be for example to incorporate M204-like file handling functionality into an application, or perhaps to develop add-ons to the basic DPT toolset, such as DBA utilities, custom $functions, or new interfaces like ODBC and SQL.
You could argue this is all somewhat outside the DPT "mission" of providing a coding and debugging platform for mainframe UL programmers, but since there is a formalised database API layer (on top of which the commands and User Language layer was built), it seems reasonable to make it available. It could also be seen as adding to the "M204 emulation" aspect of things, since this configuration approximates one or more of the IFAMs.
See also the source code download package, which includes a variety of walkthroughs, ideas and source code examples. The code snippets throughout this guide should all work.
In the above diagram, there are 5 interfaces shown, as follows:
DPT's "home" compilation platform is Microsoft Visual C++ 6, but since that is becoming rather old (it's over 10 years old now), and also is not free, assuming you could even find a copy, most users will be wanting to use other compilers. The notes that come with the source code download contain more up-to-date information about which compilers currently give successful builds, and any specific tips about getting things to work in different cases. At the time of writing, gcc and MSVC++ "express" (the free version) are the best bets.
If you link DPT with a single-threaded runtime, it should work OK as long as you only write a single-user application. For a multi-user application, multi-threading libraries should be used, if there is a choice, to ensure correct operation. These days there is usually no choice, so this is rarely an issue any more.
API Tiers
Service Objects
The simplest program
API Tiers
The DPT main host application dpthost.exe is constructed as three layers, each with a distinct level of functionality. To write a custom database API program you need to be aware of what these are, and where your code will fit in:
If you are writing an API client program which will interact with a user or users, it will have to provide a user interface itself, since the familiar DPT interfaces (the "session layer" above) are not included in that configuration. If you are a source code contributor writing a custom $function or enhancement to the main host system, the "session layer" will be present and you can work with it.
Apart from these issues, the final application you get will behave much like the regular host system in many ways, such as its handling of startup, recovery, the audit trail etc., and the System Config and DBA guides will be equally applicable.
Service Objects
DPT is fundamentally a multi-user application and the API, being a central part of things, therefore provides multi-user functionality.
The way this is managed is to provide each user thread with a set of "Service Objects", which keep track of which shared resources within the system that thread is using. All operations are performed through member functions of these service objects, which ensure correct sharing, and clean up in the event of errors or just normal termination. As a simple example, a thread's GroupServices object keeps a table of what temporary groups the user has created, and ensures they're deleted when the user logs off (i.e. when the service object is deleted).
In many cases there is some system-wide initialization or clean-up to do relating to particular services, and this is always done by the first such object when it is created. To continue the above example, the GroupServices class as a whole keeps a note of all the system-wide, sharable groups that anybody has defined, and deletes those when the last user logs off.
Luckily in simple cases an API program initially only needs to deal with service objects of type DatabaseServices, one for each user thread required. Creating one of these takes care of the creation of all the other service objects, and deleting it ensures everything is cleaned up for that user.
The main service objects are:
We will encounter all these service objects later on, but for now let's just concentrate on Database Services.
#include "dptdb.h" //All DPT headers you will need using namespace dpt; //Alternatively qualify all DPT identifiers with dpt:: int main() { APIDatabaseServices api; }All the DPT objects reside in namespace dpt, but in this document the namespace qualifier is usually omitted in class names etc, for clarity. This implies that a "using" directive is always in effect, which some would say is bad form, especially in the global namespace as above. If you prefer you can use the dpt:: qualifier. Up to you.
The class name used here (beginning with "API") is an indication that we are using a wrapper class (orange in the diagram) specially prepared for user code. This wrapper layer is discussed in more detail shortly.
The above example creates a single APIDatabaseServices object, and then terminates immediately. Well actually, as you may have noticed if you tried it, it's not quite immediately, since the constructor goes through the motions of starting up a host system. This means an audit trail, checkpoint file, etc. will also have been created. What's more, when the variable went out of scope at the end of main(), the destructor will have gone through the motions of closing down the system. With any luck the audit trail should contain a nice clean set of messages. If something goes wrong, less serious problems just produce an appropriate message in the audit trail, while in more serious cases (e.g. the audit trail could not be opened) the APIDatabaseServices constructor will throw an exception, so your code should usually be prepared for this - see later for more on the exceptions that can be thrown out of DPT.
Creating the user API as a stack object like this, rather than with "new", helps ensure correct clean-up when the thread terminates by relieving you of the responsibility to explicitly delete it. This is one of the reasons why the wrapper classes are convenient to use. With multi-threaded programs (more later) things obviously get more complex, since each thread needs its own dedicated API object.
Note re. garbage-collected environments
When working in e.g. Python or Java (pink or brown), the variable api above would go out of scope at the same time as in C++, but the APIDatabaseServices destructor would not necessarily be called right in that instant, because of delayed garbage collection. If it's the very end of the program that call will "probably" follow soon, but if you want to have specific control over the process of user logout (and DBMS closedown if it's the last user), you can force it as follows:
...
api.Destroy(); //object now unusable
...

Apart from this, the variable scope/destructor/garbage-collection issue will generally not be a problem, since many things are cleaned up by DPT.
Now some slight elaborations on the above to make DPT actually do something apart from initialize and terminate.
int main()
{
    APIDatabaseServices api(
        "myoutput.txt",   //Default: "sysprint.txt"
        "Richard",        //Default: it varies
        "myparms.ini",    //Default: "parms.ini"
        "mymsgctl.ini",   //Default: "msgctl.ini"
        "myaudit.txt");   //Default: "audit.txt"

    api.Core().Output()->WriteLine("Hello World");
    return api.Core().GetRouter().GetJobCode(); //should be 0 here
}
Firstly some comments about the constructor parameters.
Output destination: The API requires that each thread has an output destination that M204-style numbered messages can be sent to, and the above constructor assumes that you want to create a text file to take these messages. If you want them to go to some destination that you've already created in your program, such as a Windows list box or something, there's another slightly more complicated constructor.
User name: DPT's security provisions are rudimentary at the time of writing, and the user name is mainly just to help you pick out messages in the audit trail. User 0 (the first API thread in any OS process) is a special case, since the value you give here is ignored. User 0's name is taken from the SYSNAME parameter, which can be set in the parameter overrides file.
Other constructor parameters: The meanings of the other three parameters shown above are also explained in the System Config Guide.
The "Hello World" line is accessing the CoreServices service object, and within that the output destination which was supplied earlier as "myoutput.txt". The WriteLine() method is part of a DPT custom interface called LineOutput, discussed further shortly.
The final line uses the MessageRouter service object owned by CoreServices, which controls M204-style numbered messaging (MSGCTL parameter etc.) as well as maintaining various high water marks. Here we're accessing the equivalent of M204's $JOBCODE.
So that's the "hello world" program out of the way. We'll get a lot more complicated later on, but first some background on various interface conventions...
Basic data types used (including custom types)
Managed Result Objects (MROs)
Sets and cursors
Complex input parameters
The C++ Wrappers in relation to DPT internals
Exceptions and messaging
Concurrency considerations
Basic data types used (including custom types)

In addition to the standard C++ types, the following simple DPT custom types are also used. The usage of these types is covered in more detail in the walkthrough sections later.
APIFieldValue (defined in fieldval.h)
This is a kind of variant type, although not very sophisticated in that it only holds either a string or an 8-byte floating point value. It is used whenever values are passed into the API which are going to be stored in the database, or when extracting values from the database. (An important thing to be aware of with the DPT database is that it only provides this very limited range of two storage types).
The numeric constructor ensures that numeric values pass the RoundedDouble validation. Prior to DPT version 3.0 the string constructor also ensured a length of less than 255 bytes, but this restriction has now been lifted because of BLOB fields. The API functions which use these objects will now reject such over-long values where appropriate (i.e. everywhere except when using BLOB fields).
APIFieldValue fv1 = 3.14;                  //ok
APIFieldValue fv2 = "Jones";               //ok
APIFieldValue fv3 = std::string(256, '*'); //ok for V3+
if (fv1 == fv2) ...                        //different types - exception thrown
The decision to use this variant type was made partly because in many cases it makes for less cluttered interfaces and more concise calling code, and partly because Model 204 programmers are used to the behaviour of User Language, where a program continues to work exactly the same if the DBAs change the type of a field in the database. This situation cannot be achieved exactly in a strongly-typed language like C/C++, but we can get much closer with a variant type like this. When calling database-related functions, the API will internally convert the supplied value to the appropriate type if required. (Note: the conveniences of this object are one of the most painful things to lose when using the DPT API via its C wrappers.)
The APIFieldValue class also provides simple string extraction functions, which can be convenient when reading database information out.
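For instance, given a readable record handle r as in the walkthroughs later, the two extraction styles seen in this guide's examples look like this:

APIFieldValue v = r.GetFieldValue("SURNAME");
std::string s = v.ExtractString(); //copy out as a std::string
printf("%s\n", v.Cstr());          //C-style string access, handy with printf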
APIRoundedDouble (defined in floatnum.h)
This DPT custom type is used to impose M204-like handling of floating point numbers (see also the comments re. User Language number handling). The constructor may throw an exception (see below) if the number is invalid according to the M204 rules.
APIRoundedDouble rd1 = 3.33333333333333333333; //ok - but gets rounded to 15 dsf
APIRoundedDouble rd2 = "10";                   //ok - string constructor provided
APIRoundedDouble rd3 = "junk";                 //invalid - exception thrown
APIRoundedDouble rd4 = 1E76;                   //out of range - set to zero like in UL

APIRoundedDouble::SetRangeThrowOption(true);
APIRoundedDouble rd5 = 1E76;                   //out of range - exception thrown
Internally the DBMS code applies these rounding rules to all input, but you can explicitly create them as above if you really insist on full control in the calling code. Mostly the class is exposed for its extraction/printing functions, and the throw option shown above.
Automatic type conversions
Where possible the API is defined to take advantage of C++ implicit conversion, whereby a parameter of the form "const T&" can be supplied in user code as any type from which a temporary intermediate object can be constructed. This means calling code can be made briefer and more readable in common cases, such as C-style strings (char*) for std::string or APIFieldValue parameters, or numeric literals for APIFieldValue parameters:
f.FindRecords("SURNAME", FD_EQ, "SMITH"); //this f.FindRecords("SURNAME", FD_EQ, FieldValue("SMITH")); //instead of this
Symbolic constants (mostly defined in apiconst.h)
Many operations use special symbolic constants as a way of specifying certain parameters. For instance FD_EQ as shown above to mean an equality find; or e.g. FILEORG_RRN can be used when creating a file to mean "reuse record numbers".
Where this is possible the function prototype or comments in the header file will make it clear, for example using a typedef.
Line-mode input and output
There are a few situations, such as when using sequential files, and the APIDatabaseServices constructor, where your code may interact with DPT objects of class "LineInput" and/or "LineOutput". These are abstract classes allowing simple line-mode (CRLF-terminated) I/O, and if you create your own derived classes you can provide alternate interfaces to some features of the API.
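As a minimal sketch of the idea - the exact virtual function names and signatures should be checked against the declarations in the header, so treat this as illustrative only - a console-based destination might look like:

class ConsoleOutput : public dpt::LineOutput {
public:
    //Assumed override point - check the real virtual(s) in the DPT header
    virtual void WriteLine(const std::string& line) {
        printf("%s\n", line.c_str());
    }
};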
64 bit integers
Some DPT internal variables are maintained as 64 bit integers, but where these are offered to the user API (statistic values for example), the functions which supply them return 32 bit integers. In the (rare?) cases that the high-order word is non-zero you can access it if you wish using a special extra parameter. The syntax is covered in the sections below.
Native type output parameters in Java, Python, Ruby etc.
If you have managed to build DPT and get it working from a language with sophisticated data management features such as garbage collection, there are one or two situations where the DPT API provides special variants of a function in order to avoid stepping on the toes of the calling language environment. You should make sure to use these special variants where appropriate to avoid possible runtime problems, or just use them even from C++ if you prefer the look of them.
The issue concerns functions which modify parameter objects (aka "output parameters"). For example the neatest way to access record data in C++ is using the function illustrated in the main walkthrough section of this document, as follows:
class APIRecord {
    ...
    bool GetNextFVPair(std::string&, APIFieldValue&, int&);
    ...
};

std::string fname;
APIFieldValue fval;
int fvcount = 0;
while (rec.GetNextFVPair(fname, fval, fvcount))
    //use results etc.

The calling code here must have 3 variables, which are all modified by the DPT function - one way of looking at this is that the function effectively has 4 return values. At the time of writing, Python wrappers generated by SWIG implement the "int" and the "std::string" parameters as Python native-type objects, which are therefore subject to all that entails in terms of native data management. So for example the Python language environment will be maintaining reference counters and other control information in order to do cool stuff like shared storage for variables with the same value, and of course garbage collection when variables go out of scope. But for this data management to work reliably, all access to such variables has to be via Python language statements, and things are liable to break if underlying storage is modified via other means - in this case by the "black box" that is DPT. Therefore alternative functions are provided which do not modify any parameter objects:
while (rec.AdvanceToNextFVPair()) {
    std::string fname = rec.LastAdvancedFieldName();
    APIFieldValue fval = rec.LastAdvancedFieldValue();
    //etc.
}

There are a handful of other, less important, places throughout the DPT API where output parameters are used, and the alternative functions, if present, are commented clearly in the header files.
Managed Result Objects (MROs)

This term is used to cover a wide range of more complex DPT data types than the simple ones described above. When the API gives you the results of a call, they will come in one of two forms. The first, and simplest, is where the result data is located and copied into your program's local storage. This happens when, say, you retrieve a field value from a record.
Equally common are "managed result objects" (MROs), where the information is located but the API keeps it internally and only gives your code a pointer to it. A good example of this is a record set resulting from a database search. In these cases DPT is taking care of the various resource acquisitions required to prepare and hold the results, and likewise the release of resources if or when your code says to do so.
MROs are actually a very similar idea to service objects. The main difference is that while the service objects are always present, MROs are only created as the result of some request issued to one of the service objects or another MRO. For example, when opening a file prior to doing a database search:
void anyfunc(const APIDatabaseServices& api)
{
    APIDatabaseFileContext custfile = api.OpenContext("FILE CUST");
}

This function call creates a result object which is completely owned and managed by Database Services, and will be completely cleaned up, if we don't issue a CloseContext() first, when the thread deletes the APIDatabaseServices object.
After opening, a context object can be regarded as effectively just another service object, which has its own member functions (e.g. find records), which will keep track of activity for the current user (e.g. only keep one copy of a file open even if it's re-opened), and which will clean up all its own managed objects on destruction if user code hasn't requested that first (e.g. releasing record sets and value sets).
Hierarchy of MRO construction paths
Here is a diagram showing which results are supplied by which objects. Square brackets mean the result data is handed over to user storage with no strings attached. Otherwise the data is an MRO owned by the parent object. There are several ways of accessing some entities, as per the side remarks saying which User Language statement creates each under the covers.
Destruction of MROs
By definition the destructors for MROs are private functions you can't use directly. MROs must be deleted via the same "parent" object that was used to create them. So to close the file above (thus deleting associated resources and invalidating the pointer), you would use:
    ...
    api.CloseContext(custfile);
}
Sets and cursors

In cases like these the contents of the set are accessed using another object, a cursor, which is opened once you have the set. The cursor is a child MRO of the set, and when the set is destroyed, so are any cursors open against it. So in this example the CloseCursor() call is optional.
void FindAndLoop(APIDatabaseFileContext& f, LineOutput& op)
{
    APIFoundSet s = f.FindRecords("SURNAME", FD_EQ, "JONES");
    APIRecordSetCursor c = s.OpenCursor();
    for (; c.Accessible(); c.Advance(1)) {
        APIReadableRecord r = c.AccessCurrentRecordForRead();
        op.WriteLine(r.GetFieldValue("SURNAME").Cstr());
    }
    s.CloseCursor(c); //optional
    f.DestroyRecordSet(s);
}
Using cursors has several benefits. One is that it hides the implementation details of the set. A record set is again a good example, since we know that the representation is probably, but not necessarily, a bitmap. Accessing records via a cursor means it doesn't matter.
Another benefit is that code can be written which will generically access different types of set. Taking the above found set example, if you later decide you want to sort the set before looping on it, the loop code need not change since sorted record sets are polymorphically based on the same record set base class as found sets. More details and code samples on this issue, and on particular kinds of cursors, are given throughout the document.
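For example, a helper written purely in terms of the cursor works unchanged whichever kind of record set the cursor was opened against:

void PrintSurnames(APIRecordSetCursor& c, LineOutput& op)
{
    //Same code whether the cursor came from a found set or a sorted set
    for (c.GotoFirst(); c.Accessible(); c.Advance(1)) {
        APIReadableRecord r = c.AccessCurrentRecordForRead();
        op.WriteLine(r.GetFieldValue("SURNAME").Cstr());
    }
}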
The "CloseCursor" and "DestroyRecordSet" calls in the above example might easily be eliminated with a little RAII groundwork (C++ technique) if you wanted to do so, making the loop more modular and robust in the case of exceptions thrown out of it.
Cursor options
The different types of cursors provide certain general functions (Advance, GotoFirst, etc.), and certain specific functions appropriate to their situation. In many cases it is also possible to control behaviour via options, which can be ORed together for example:
...
APIDirectValueCursor c = file.OpenDirectValueCursor();
c.SetOptions(CURSOR_ADV_NO_OVERRUN | CURSOR_POSFAIL_REMAIN);
...

Other options specific to each type of cursor are discussed in more detail later where appropriate, but the NO_OVERRUN option is general, and tells the cursor to remain positioned on the last element in a set when Advance() would otherwise go "off one end" and make the cursor become inaccessible. This is not the default as it's probably a minority taste.
The C++ Wrappers in relation to DPT internals

So the complexity of the underlying implementation is hidden with C++ wrapper classes, all beginning with the prefix "API". Most of the interaction your code does with DPT can be via these wrapper classes, although with one or two simple utility classes (LineOutput for example, if you use that) it's not. The wrapper classes were designed to be simple and easy to use, and it doesn't really matter how they relate to the DPT internal interfaces, except to know that they are just very thin wrappers implementing a kind of "smart pointer" for each of the wrapped classes. In other words they mean you don't have to issue "new" and "delete", but can just declare stack objects and let the wrappers take care of all that.
The following additional notes on how the wrappers are implemented may be interesting background, and perhaps also prove useful in some coding situations.
If you're building DPT from source you have the option of using the wrappers - your calling code will then look like that in this document - or going straight in and manipulating the underlying objects however you like. In general you should have few problems calling the "public" C++ functions of the internal classes, since many were made public with the eventual open source situation in mind. Moving functions from private/protected to public should be done with more caution (as always of course in C++).
Returning to the wrappers, take for example the internal GroupServices class:
class GroupServices {
    //implementation data
    //private apis
public:
    //functions "public" to DPT internal code
    ~GroupServices();
    //etc.

    //functions suitable for user API
    CreateGroup(...);
};

This is presented to the user API something like this:
class APIGroupServices {
public:
    GroupServices* target;
    CreateGroup(...); //Simply passes through to target
};
When you get an object of this class all operations you perform on it are simply passed through to the wrapped object via the "target" pointer. The member variable is public, but it would probably be unusual for user code to access it.
You can pass these objects around by value if you wish, and in most cases the copy mechanics are trivial (a single pointer and maybe a reference count manipulation). This is not always true however and generally speaking, as with any C++ object, passing pointers or C++ references gives the most predictable runtime behaviour. For example:
void f(APIDatabaseFileContext f)
{
    APIValueSet vs = f.FindValues("SURNAME", FDV_LK, "J*"); //3 million values

    APIValueSet* pvs = &vs; //Simple - takes address
    APIValueSet& rvs = vs;  //Same - just different C++ syntax
    APIValueSet cvs = vs;   //Does it copy all 3M values?
}

For what it's worth, in the above DPT example the values are not copied, leaving variables "vs" and "cvs" both targeted at the same underlying data. So passing an APIValueSet object by value as a function parameter would be reasonable. Reasonable in the current implementation, that is - there is no guarantee that the copy mechanics for any object will remain trivial in future even if they are now, and it would be best not to rely on this behaviour.
Some types, such as APIValueSet, have explicit "clone" functions for use in cases where you might want to copy the underlying data only occasionally. Hopefully it will become clear during the walkthrough sections later, why and when this separate functionality is provided.
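A sketch of the distinction (Clone() here is a hypothetical spelling - check the actual clone function names in the headers):

APIValueSet vs = f.FindValues("SURNAME", FDV_LK, "J*");
APIValueSet shallow = vs;      //trivial copy - both target the same underlying data
APIValueSet deep = vs.Clone(); //hypothetical name - actually copies the values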
Exceptions and messaging
Many of the API functions throw C++ exceptions when there are errors, rather than setting special return codes, and you have the choice of catching and handling these if you want to (many of the examples below do not bother, for readability's sake). As a minimum you would probably want to wrap your entire program, or the entire session for each user thread, in an overall try/catch block, if only to be able to issue diagnostics before terminating.
In nearly all cases the DPT infrastructure throws exceptions of a single type, namely "dpt::Exception". When you catch an object of this class you can access its contents, which consist of a numerical code plus a more-or-less helpful string value. The numerical code can sometimes be used to test the exception and re-throw it if appropriate. This is not very sophisticated exception handling, but it has the important feature that the exception codes always match the numerical M204-style message codes issued to the terminal and audit trail.
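For example, a catch block might absorb one expected code and re-throw anything else (DB_FIELD_NOT_INDEXED is a real code mentioned later in this document):

try {
    //some API operation
}
catch (Exception& e) {
    if (e.Code() != DB_FIELD_NOT_INDEXED)
        throw; //not one we can handle - pass it on
    //deal with the expected case here
}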
The earlier FindAndLoop example might reasonably have been coded to ensure correct cleanup, a little like this:
void FindAndLoop(APIDatabaseFileContext f, LineOutput* op)
{
    APIFoundSet s = f.FindRecords("SURNAME", FD_EQ, "JONES");
    APIRecordSetCursor c = s.OpenCursor();
    try {
        //loop as above
    }
    catch (Exception& e) {
        s.CloseCursor(c);
        f.DestroyRecordSet(s);
        char buff[64];
        sprintf(buff, "Error in find/loop function, code %d: ", e.Code());
        op->Write(buff);
        op->WriteLine(e.What());
    }
}

The exception code constants are defined in various header files with names of the form msg_xxx.h. One or two of the codes are mentioned in this document where it might be useful to know them.
In most cases before throwing the exception out to your code, the infrastructure will have already taken a certain amount of action, such as releasing resources and memory, invoking transaction backout if necessary, and writing a message, via normal "MSGCTL" controlled message routing.
Special exception types
There is one case where the API throws an exception of a different type, although it is also a type derived from dpt::Exception, and so long as you catch exceptions by reference as shown above, a single catch block can cater for all cases. In this special situation, namely a record locking conflict, the thrown object contains some more information, which can be useful in deciding how to proceed. If and when DPT provides DBMS facilities like M204 UNIQUE fields, a similar convention might apply for the "ON FIELD CONSTRAINT CONFLICT" exceptions.
Concurrency considerations
In general you shouldn't have to worry *too* much about low-level structure locking issues like Critical File Resources, although see the DBA Guide for a little background if interested.
Application level record locking is discussed in later sections, for example those on locking finds and record updates and LPU.
In nearly all cases the API functions release the low level locks they take, which means your code can be confident that it will not deadlock because of locks that are held without your knowledge. One exception to this rule is when you open files and groups. Doing so places a lock on the file and/or group which isn't released until Close(). This however does not raise the possibility of deadlock, because the Open() call will always fail immediately if the file can't be opened - it does not wait.
See later on for some notes about multi-threaded runs.
Issues surrounding file allocation are discussed in the DBA guide, but with the API we can usually reduce things right down to the file name and "dataset name", i.e. the OS file name. The Allocate() function will throw exceptions in situations such as invalid file names or dispositions.
void f()
{
    APIDatabaseServices api;
    api.Allocate("CUST", "demo\\cust.204");                  //default is OLD
    api.Allocate("SALES", "demo\\sales.204", FILEDISP_COND); //create if nonexistent
    ...
Assuming the file has been created and initialized, we can now open it for use. If not see the later section on DBA functions such as defining fields.
When opening, simply specify the same strings as you would on the OPEN command on Model 204, such as "SALES", "TEMP GROUP SALES" etc. The familiar Model 204 priority rules are applied when groups exist.
    ...
    //Must be a single file
    APIDatabaseFileContext cust = api.OpenContext("FILE CUST");

    //Might be a temp or perm group
    APIDatabaseFileContext sales = api.OpenContext("SALES");

    api.CloseContext(cust); //or let them get closed automatically when api is destroyed
}
When a file context is closed, the MRO becomes invalid for further use.
During the processing of a CloseContext() call, DPT will *not* by default release all resources and locks relating to that context, (foundsets, lists, value sets etc.). Instead if any of these things remain the function fails and throws an exception. Some DPT objects do "quietly" clean up their children, but others like this one do not. Here we would expect child objects to have been explicitly dealt with first, and failing to do so is considered a mistake. You can force everything to be released with an extra "force" parameter on the close function, but this is not something to get into the habit of doing as a matter of course, since you are giving DPT one less chance to highlight logic errors in your code.
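As an illustration only - the exact spelling of the force parameter should be checked against the CloseContext() declaration in the headers:

api.CloseContext(cust, true); //hypothetical "force" flag - quietly releases any remaining child objects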
Looking at one of the more relaxed cases, it is slightly less bad form to let files be automatically de-allocated at the end of the run than it is to let them be closed, so Free() is optional there. This function throws exceptions to indicate failure, for example if the named file is not actually allocated. Note that this is different from the Model 204 FREE command, which produces no error message if the file is not allocated.
void anyfunc(APIDatabaseServices* db)
{
    db->Allocate("CUST", "demo\\cust.204");
    db->Free("CUST");
    db->Free("CUST"); //already gone - throws an exception
}
Database find operations all result in a set of records, which can be examined, counted, etc. after the search. DPT allows finds to be issued either in a lowish-level (some might say long-winded) C++ style, or (from Version 3.0) in a textual way similar to User Language. Which way is more convenient will depend on the application.
Here the syntax is introduced, and in Chapter 5 much more detail is covered.
In the C++ style the field names, operators, and values are all given as separate function parameters, letting the compiler do its usual helpful stuff like type checking, implicit conversions, and so on.
void anyfunc(APIDatabaseFileContext& f)
{
    APIFoundSet s1 = f.FindRecords("SURNAME", FD_EQ, "ADAMS");     //equality
    APIFoundSet s2 = f.FindRecords("SURNAME", FD_NOT_EQ, "ADAMS"); //reverses bitmap afterward
    APIFoundSet s3 = f.FindRecords("SALARY", FD_GT, 1000);         //range
    APIFoundSet s4 = f.FindRecords("SURNAME", FD_LIKE, "A*");      //pattern match
    APIFoundSet s5 = f.FindRecords("PETS", FD_PRESENT);            //table B search >:-O
    APIFoundSet s6 = f.FindRecords();                              //Complete EBP copy: all records
}
The text style entails constructing a single query string and letting DPT pick it apart into its components. This can greatly simplify GUIs, where users are often given the ability to enter queries by hand, this being more convenient than laboriously choosing items from dropdown menus etc. This type of API call however gives less detailed control, and fewer opportunities for the C++ compiler to help us out. Here are the same queries as above:
void anyfunc(APIDatabaseFileContext& f)
{
    APIFoundSet s1 = f.FindRecords("SURNAME = ADAMS");
    APIFoundSet s2 = f.FindRecords("SURNAME NE ADAMS");
    APIFoundSet s3 = f.FindRecords("SALARY GT 1000");
    APIFoundSet s4 = f.FindRecords("SURNAME is like A*");
    APIFoundSet s5 = f.FindRecords("PETS is present");
    APIFoundSet s6 = f.FindRecords("");
}

The section later on Advanced Finds has some more details about the full syntax supported in this style.
In each case above we did not say whether we wanted a character or numeric format search, so DPT goes by the attributes of the field in the database. You can force a particular comparison type if you want to, as shown here. All operators can have their results negated using a FD_NOT_xxx version.
void anyfunc(APIDatabaseFileContext& f)
{
    //C++ style
    APIFoundSet sa = f.FindRecords("SURNAME", FD_A_EQ, "ADAMS");     //force char equality
    APIFoundSet sb = f.FindRecords("SURNAME", FD_A_NOT_EQ, "ADAMS"); //inverse

    //Text style
    APIFoundSet sc = f.FindRecords("SALARY IS NUM GT 1000");         //force num range
    APIFoundSet sd = f.FindRecords("SALARY IS NOT NUM GT 1000");     //inverse
}

It could be argued that using these special operators enhances code readability, although you would have to be careful not to trigger a table B search (see below). On DPT there is only ever one type of index per field, so unlike Model 204 there's no time we'll ever find ourselves in the situation where we have to tell the database which index to use.
Remember that in User Language "field IS NOT LIKE pattern" picks up only records which have a value, which does not match the pattern, whereas "NOT field IS LIKE pattern" picks up records with no value at all. This distinction is provided by DPT as the operators FD_UNLIKE and FD_NOT_LIKE respectively.
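In API terms the distinction looks like this:

//UL "SURNAME IS NOT LIKE A*" - records with a value, none of which match
APIFoundSet s1 = f.FindRecords("SURNAME", FD_UNLIKE, "A*");

//UL "NOT SURNAME IS LIKE A*" - the complement, including records with no SURNAME at all
APIFoundSet s2 = f.FindRecords("SURNAME", FD_NOT_LIKE, "A*");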
Foundsets are one of several kinds of record sets (see later), which all share a range of abilities in common. Some obvious ones:
void SetInfo(APIFoundSet& s)
{
    cout << "Count: " << s.Count() << endl;
    s.Clear();
}

Clearing a set is equivalent to the User Language RELEASE RECORDS statement, in that it just empties rather than destroys the set object. To destroy it, go via the context object that created the set. There is no API equivalent of the UL RELEASE ALL RECORDS statement (i.e. clear all sets in all files), although there is a function to destroy all record sets.
More interesting record set operations such as looping and combining are covered in their own chapters later.
A search where the number of table B records to be scanned exceeds MBSCAN makes a "Do You Want To?" callback query with the user thread, with a default answer of false (don't do the table B search).
If you do install your own callback function to deal with this rather than taking the default remember that, as on Model 204, this is a delicate moment because the DBMS is waiting with quite a lot of resources locked. For example a find with any table B search criteria takes a share-mode record lock on every record in the file before it begins, and this lock will still be in place when the DYRWT is called. In addition there will be a lock on the EXISTS CFR, and possibly also the INDEX CFR.
void f(APIFoundSet& s)
{
    APIRecordSetCursor c = s.OpenCursor(); //starts at first record
    c.Advance();                           //default = 1 record
    c.Advance(10);                         //now at 11th record
    if (!c.Accessible()) cout << "No more records";

    c.GotoLast();
    c.Advance(-1);                         //move backwards
    if (!c.Accessible()) cout << "No more records going backwards";
}

All DPT cursors go to the start of the set when you open them. However, if the record set (or other container) is empty, the cursor will be marked straight away as "inaccessible". Once advancing past the end of the set has made the cursor inaccessible, it is *not* possible to backtrack by one to get the last record.
RecordSetCursor objects do not hold any kind of structure locks on the file over and above those held by the set itself. Only if you stop to pay closer attention to one of the records (see next chapter) do enqueueing considerations come in. This is the same as User Language where you can e.g. print $CURREC without a record lock being required.
If the set on which a cursor is based is cleared, all cursors open against the set are rendered Inaccessible(). Trying to use an inaccessible cursor will either return non-useful results, or throw an exception, depending on the situation.
Note: In certain cases it may be desirable to use random access into record sets rather than the strict directionality available with cursors. This is discussed briefly in the miscellaneous ideas section.
Accessing the current record
Method 1: Accessing single fields
Method 2: Taking a record snapshot
Accessing the current record
When looping on a set using a record cursor, as in a FOR EACH RECORD loop on Model 204, you are really just traversing a list of record numbers (or often actually a bitmap). This is quite a simple and efficient process, and without doing anything else can also yield some interesting information:
void setlooper(APIFoundSet& s)
{
    APIRecordSetCursor c = s.OpenCursor();
    for (; c.Accessible(); c.Advance()) {
        printf("Positioned in set at record number %d in file %s\n",
               c.LastAdvancedRecNum(),
               c.LastAdvancedFileContext().GetShortName().c_str());
    }
    s.CloseCursor(c);
}

The above example is equivalent to a record loop on Model 204 where you printed $CURREC and $CURFILE in the loop. No access to table B is required to get this information - it is intrinsic to the set data structure. However in most situations you do want to get at record data, for example:
void recinfo(APIRecordSetCursor& c)
{
    APIReadableRecord r = c.AccessCurrentRecordForRead();
    printf("SURNAME = %s\n", r.GetFieldValue("SURNAME").Cstr());
}

The first line here - the creation of the record MRO - is the point at which the EBP is checked for the existence of the record, so if you'd deleted it since creating the found set you'd get the well-loved "non-existent record" exception familiar from Model 204. The second line - retrieving the field value - is when DPT finally goes to table B and scans the record for the requested field.
You might be wondering why we had to declare that we were accessing the record "for read" in the above example - this is not the same as Model 204 right? One reason is that it allows DPT to go a little easier with resource locking under the covers, which saves time. A second reason is that it makes for a cleaner generic class hierarchy including sorted record sets and read-only record snapshots (more on both later). The distinction ultimately results in accessing objects of class APIRecord or APIReadableRecord. The former allows all the record updating functions like ChangeField and DeleteField. The latter only allows read functions.
Method 1: Accessing single fields
This corresponds to the type of access in the User Language statements PRINT NAME, or %x = NAME(2), and is supported by API functions with various overloads which you can choose from according to taste. For example some are shown below, and another is shown here.
void PAI(APIReadableRecord& r, LineOutput* lout)
{
    APIFieldValue v = r.GetFieldValue("NAME"); //1st occurrence
    v = r.GetFieldValue("NAME", 2);            //2nd occurrence
    r.GetFieldValue("NAME", v, 3);             //3rd occurrence - alternate syntax

    //All occs of all fields...
    int ix = 0;
    std::string fname;
    while (r.GetNextFVPair(fname, v, ix)) {
        lout->Write(fname);
        lout->Write(" = ");
        lout->WriteLine(v.ExtractString());
    }
}

When it performs these single accesses to the record on its table B page(s), DPT maintains its position within the record, in a similar way to more recent versions of Model 204. Subsequent accesses for a higher occurrence of the same field do not result in a re-scan from the start of the record, and the following is a good way to perform an occurrence loop:
void PAI(APIRecord& r)
{
    APIFieldValue v;
    int occ = 1;
    while (r.GetFieldValue("MIDDLE NAME", v, occ++))
        cout << v.ExtractString().c_str() << endl;
}
It's worth noting that using the API gives an advantage not available in User Language, namely the ability to get more than one handle on the same record. This is an efficient way of performing a loop on a multiply-occurring group of fields, since each APIReadableRecord object would be maintaining its own position in the record. If the record is updated whilst such a loop is underway, it invalidates the remembered occurrence position, and in such cases DPT simply starts its scan again from the beginning of the record.
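For instance, a sketch of a loop over a hypothetical multiply-occurring group (CHILD NAME and CHILD AGE are made-up fields), relying on the three-argument GetFieldValue() form returning false when the occurrence is absent, as in the loop above:

void GroupLoop(APIRecordSetCursor& c, LineOutput* lout)
{
    //Two handles on the same record, each maintaining its own scan position
    APIReadableRecord names = c.AccessCurrentRecordForRead();
    APIReadableRecord ages  = c.AccessCurrentRecordForRead();

    APIFieldValue name, age;
    int occ = 1;
    while (names.GetFieldValue("CHILD NAME", name, occ) &&
           ages.GetFieldValue("CHILD AGE", age, occ)) {
        lout->Write(name.ExtractString());
        lout->Write(" is ");
        lout->WriteLine(age.ExtractString());
        ++occ;
    }
}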
BLOB data
Another point to note is that BLOB fields (DPT version 3+) are accessed just the same as normal string fields, with the outcoming FieldValue object containing the extra-long value. To access the descriptor value held in table B, or perform conditional processing like the User Language PAI statement does, you must go via the "real" record MRO pointer (APIRecord*). Put another way, the special BLOB functions don't work on cached field value data in e.g. APIRecordCopy or APISortRecord, both covered later. See comments in record.h. E.g.
APIFieldValue desc = r.GetBLOBDescriptor("COMMENTS");

The two new functions which request BLOB descriptors just return regular field values if the field is not a BLOB field. For example if COMMENTS above is just a regular STRING field, that call behaves the same as GetFieldValue("COMMENTS").
Method 2: Taking a record snapshot

This is the way the UL compiler implements PAI statements (plus, indirectly, some others like PRINT EACH, SORT BY EACH etc.). DPT acquires the appropriate file resources once only, and comes away with all the field information for the entire record in a local data structure. This structure is a member of the same APIReadableRecord base class as an APIRecord, so it can be accessed for read operations in the same way.
void PAI(APIRecord& r)
{
    APIRecordCopy copy = r.CopyAllInformation();

    //Read exactly like normal now
    int ix = 0;
    std::string fname;
    APIFieldValue fval;
    while (copy.GetNextFVPair(fname, fval, ix)) {
        cout << fname.c_str() << " = " << fval.ExtractString().c_str() << endl;
    }
}
The main way in which these snapshots differ from accessing the record directly is when it comes to updating functions: if you update the real record the snapshot will not be updated (just like User Language).
Multi-criterion finds
Refer-back finds
Record locking and finds
Special criteria and other notes
Multi-criterion finds

As introduced in Chapter 2, the API will accept simple queries in "C++ style" or "text style", and this is also true of arbitrarily complex queries. This time let's take text style first, as it's more obvious. Simply append conditions to the query string, combining with AND/OR and parenthesizing as required. The syntax is much the same as User Language, although slightly stricter and not supporting the same range of quirks and nuances you can use there. The most important differences are:
The rules are summarised formally here, including a couple of non-User Language extensions to accommodate common language conventions, like "!=" for NOT EQ, "&" for AND, and "|" for OR. Quotes (either type, so long as they match) must also be used to enclose spaces in operands. Hex format can be used with quotes to express operands like e.g. x'0d0a'. Query keywords can be in mixed case, but operands are case sensitive.
void TextualFinds(APIDatabaseFileContext& f)
{
    APIFoundSet s1 = f.FindRecords("SURNAME = 'PINKETT SMITH'");
    APIFoundSet s2 = f.FindRecords("salary > 20000 and (age < 18 or age > 65)");
    APIFoundSet s3 = f.FindRecords("DAYJOB='Journalist' & NIGHTJOB='Superhero'");
    APIFoundSet s4 = f.FindRecords("year between 1918 and 1939");
}
Now let's look at the more complex C++ style. To set up a multi-criterion find this way you have to build an APIFindSpecification object containing all the field names, operators, and field values, and the appropriate combination of ANDs and ORs. For example, say we want to code an API program to do the equivalent of this User Language statement:
IN VEHICLES FIND ALL RECORDS WHERE -
    COLOR = BLUE AND (MAKE = FORD OR MAKE = TOYOTA AND (BODY = 2DR OR 4DR))
END FIND

To construct the find specification we must first construct each single condition and then combine them:
APIFoundSet FD(APIDatabaseFileContext& f)
{
    APIFindSpecification cb("COLOR", FD_EQ, "BLUE"); //local variables
    APIFindSpecification mf("MAKE", FD_EQ, "FORD");
    APIFindSpecification mt("MAKE", FD_EQ, "TOYOTA");
    APIFindSpecification b2("BODY", FD_EQ, "2DR");
    APIFindSpecification b4("BODY", FD_EQ, "4DR");

    APIFindSpecification total_expr = cb & (mf | mt & (b2 | b4)); //builds expression tree

    return f.FindRecords(total_expr); //optimizes and executes
}

The C++ compiler takes care of AND/OR precedence because the "&" and "|" operators are overloaded for class APIFindSpecification. At run time the FindRecords() call will receive a combined expression with the following structure, which it can evaluate from the leaves back to the root.
                 Total Expression
                        |
          ------------ AND ------------
          |                           |
    COLOR = BLUE        ------------ OR ------------
                        |                          |
                   MAKE = FORD       ------------ AND ------------
                                     |                           |
                               MAKE = TOYOTA          ------- OR -------
                                                      |                |
                                                 BODY = 2DR       BODY = 4DR
If working in a language via wrappers which can't overload these combination operators, use the alternative functions like Splice() to combine criteria explicitly, if less neatly.
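The Splice() parameters aren't reproduced here, so the following is purely a shape sketch with a hypothetical signature - check the real one in the headers:

APIFindSpecification cb("COLOR", FD_EQ, "BLUE");
APIFindSpecification mf("MAKE", FD_EQ, "FORD");

//Hypothetical usage - the explicit equivalent of (cb & mf)
cb.Splice(FD_AND, mf);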
APIFindSpecification blue("COLOR", FD_EQ, "BLUE");
APIFindSpecification red ("COLOR", FD_EQ, "RED");

//These two will give the same final result
APIFindSpecification notblue1("COLOR", FD_NOT | FD_EQ, "BLUE");
APIFindSpecification notblue2 = !blue;

//Negate whole subexpressions
APIFindSpecification neither = !(blue | red);
APIFindSpecification allrecs = !(blue & notblue1);

As ever, complex expressions with double negatives like this are best avoided. Not because the computer has trouble with them, but because the human reader has trouble visualizing what the computer is going to do. DPT actually implements expression negation such as the last one as if you'd said:
APIFindSpecification allrecs = !blue | !notblue1;

Which is, if you think about it, the same thing (but you had to think about it, didn't you?!), since according to De Morgan's theorem !(A & B) = !A | !B. DPT manages all the technical aspects of this (EBP retrievals etc.) in an appropriately efficient way.
Refer-back finds

Consider a file of 5 segments:
void FD(APIDatabaseFileContext& f)
{
    //records in segments 1 and 2
    APIFindSpecification agespec("DATE OF BIRTH", FD_GE, 19900101);
    APIFoundSet ageset = f.FindRecords(agespec);
    ...

To refine this set of youngsters to those born in England, we have three choices. Expressed in User Language terms, they are to do a complete new find using two criteria, or to make use of the work already done and refer back using either "FD IN..." style syntax or "FIND$..." style syntax. "FD IN..." is usually preferred as it can be somewhat faster - it doesn't read the EBP again. FIND$ can sometimes be neater in situations with many sets to combine.
Here are those alternatives expressed as DPT API calls:
    //records in all 5 segments
    APIFindSpecification ctyspec("COUNTRY", FD_EQ, "ENGLAND");

    //Search all segments twice - EQ outranks GE
    APIFoundSet s1 = f.FindRecords(agespec & ctyspec);

    //"FD IN..." - searches just segs 1 and 2
    APIFoundSet s2 = f.FindRecords(ctyspec, ageset);

    //"FD FIND$..." - FIND$ outranks EQ - 2 segs, but EBP again also
    APIFindSpecification referback(ageset);
    APIFoundSet s3 = f.FindRecords(referback & ctyspec);
}

Any FindRecords(...) call can be given a final parameter of another found set (or list - see later), and it will use that set as the basis for the find instead of the EBP (i.e. the whole file). In both cases the base set must relate to the same file context as that in which the find is being performed, or else an exception will be thrown.
Record locking and finds

As with User Language, an API program has the option to request different levels of record locking on found sets. By default DPT places a share record lock, but to create sets with no locks or exclusive locks, simply pass the appropriate symbolic constant into the FindRecords() call. Again as with User Language, you should use the default unless there is a good reason not to. Some of the implications of reading data via unlocked found sets are discussed throughout this document.
void anyfunc(APIDatabaseFileContext& f)
{
    APIFindSpecification allrecs;
    APIFoundSet excl_set = f.FindRecords(allrecs, FD_LOCK_EXCL);
    APIFoundSet no_lock_set = f.FindRecords(allrecs, FD_LOCK_NONE);

    excl_set.Clear();    //removes exclusive lock and frees memory
    no_lock_set.Clear(); //just frees memory - still worthwhile
    ...

An API program also has an option not directly available in User Language, namely to retain a set but remove the lock on it:
    ...
    excl_set.Unlock();
}

If you fail to clear or destroy foundsets explicitly, all their associated resources get destroyed when closing the context.
Record locking failure
In all cases where this occurs, including situations involving record updates (which require an EXCL lock), the API indicates a failure to get the required lock by throwing a special exception object. This object is derived from a normal DPT "Exception", but also contains all the information that is obtainable from the UL $RLCxxx functions.
for (;;) {
    try {
        //operation that clashes with another user
        break; //success - done
    }
    catch (Exception_RLC& e) {
        cout << "Record locking conflict" << endl;
        cout << "File      : " << e.RLCFILE().c_str() << endl;
        cout << "Record #  : " << e.RLCREC().c_str() << endl;
        cout << "Enemy name: " << e.RLCUID().c_str() << endl;
        cout << "User #    : " << e.RLCUSR().c_str() << endl;

        if (/* ask if they want to try again */)
            Sleep(1000); //small pause before retrying?
        else
            break;       //no - give up
    }
}

Under the covers the RLC exception is only thrown after normal M204-style ENQRETRY processing has taken place. To be more explicit, the API does the work to build the found set, then makes N attempts to lock it, where N is the value of the ENQRETRY parameter + 1, and there is a 3-second wait in between attempts. (The 3 seconds is arbitrary, but is a convention copied from Model 204.) If all these attempts fail, the found set is discarded again and control returned to the calling code with the exception. It's up to you then whether to try again - in User Language this is where the ON RECORD LOCKING CONFLICT unit would be called, and you could code RETRY in there.
Note that this is not somewhere that the API automatically invokes the "do you really want to" callback, but doing so explicitly would be one reasonable way to handle the user prompt in the above example.
    ...
    if (!core.InteractiveYesNo("Do you want to try again?", false))
        break;
    ...
Special criteria and other notes

Here are some examples - take a look in "apiconst.h" to see all the choices.
void SpecialFinds(APIDatabaseFileContext& f)
{
    APIFoundSet s1 = f.FindRecords(APIFindSpecification(FD_FILE$, "MYFILE"));
    APIFoundSet s2 = f.FindRecords(APIFindSpecification(FD_SINGLEREC, 100));
    APIFoundSet s3 = f.FindRecords(APIFindSpecification("AGE", FD_RANGE, 18, 25));

    //Achieve same set as s2
    APIFoundSet s4 = f.FindRecords(
        APIFindSpecification(FD_POINT$, 100) &
        APIFindSpecification(FD_NOT_POINT$, 101));
}
As with M204, there are two ways of achieving a User Language style value loop. The first is to tell the system to retrieve and cache all the desired values of the field, and walk along the cached values. The second is to tell the system to walk along the actual database b-tree leaf values. The first way is more efficient if you're going to do several iterations over the same set of values, although in a big file the memory required to cache the set might become a consideration. The second way is best in most other cases, uses minimal memory and is simpler too.
Note that you the programmer have to make the decision about which way to code it, based on the field attributes and data characteristics as you know them. The dynamic decision between the two techniques which is sometimes made in User Language is a service of the UL compiler that is not available at API level.
Because of the significant difference in the two ideas, they are supported by two different types of cursor, as shown here.
void ValueSetLoop(APIDatabaseFileContext& f)
{
    APIValueSet s = f.FindValues("SURNAME", FDV_LK, "A*"); //UL "FIND VALUES"
    s.Sort(SORT_NUM);                                      //UL "SORT VALUES" (bizarre example)

    APIValueSetCursor c = s.OpenCursor();                  //default = first value
    c.GotoLast();
    while (c.Accessible()) {                               //UL "FRV IN label"
        APIFieldValue v = c.GetCurrentValue();
        std::string sval = v.ExtractString();              //cater for ORD NUM or ORD CHAR
        c.Advance(-1);
    }
}
void DirectValueLoop(APIDatabaseFileContext& f)
{
    APIDirectValueCursor c = f.OpenDirectValueCursor("SURNAME", CURSOR_DESCENDING);
    while (c.Accessible()) { //UL "IN F FRV SURNAME"
        //...
        c.Advance(1);
    }
}
A direct value cursor has the direction built into it, unlike set cursors we've seen till now which can only control their direction using a negative Advance() value. In the example above a positive increment advances a reverse b-tree walk.
As with M204, any value loop order other than the index order for the field means that the values are going to have to be retrieved and sorted before looping can begin. To do this you must use the ValueSet method. In such cases requesting a sort into the order that the set was already ordered in does nothing and incurs no overhead.
If the field specified is not indexed, it will cause an exception with code DB_FIELD_NOT_INDEXED at the appropriate point.
Even a direct value cursor does not explicitly hold a lock on the b-tree across calls, since loops of this kind might take quite a long time, and it would not be acceptable to block all updating threads. If the b-tree is updated, subsequent calls via the cursor will attempt to reposition at the point closest to the value they left off at. As with M204, this may or may not require the tree to be re-searched from the root, (causing another BXFIND statistic to clock up).
There is no concise equivalent in the API to the User Language FR IN ORDER formulation. The UL compiler provides this functionality with a combination of a DirectValueCursor, a RecordList and a RecordSetCursor.
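Purely as an illustration of the idea (this is not the compiler's actual RecordList-based implementation, and it assumes the direct value cursor exposes a GetCurrentValue() like its value set counterpart), an order-driven record loop might be sketched as:

void ForEachRecordInOrder(APIDatabaseFileContext& f)
{
    APIDirectValueCursor vc = f.OpenDirectValueCursor("SURNAME");
    while (vc.Accessible()) {
        //All records holding the current value, in record number order
        APIFoundSet s = f.FindRecords("SURNAME", FD_EQ, vc.GetCurrentValue());
        APIRecordSetCursor rc = s.OpenCursor();
        for (; rc.Accessible(); rc.Advance(1)) {
            //process each record here
        }
        f.DestroyRecordSet(s);
        vc.Advance(1);
    }
}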
The API has a couple of built-in facilities for improving efficiency when managing the position of cursors in the b-tree, which you can use as appropriate to your application:
void myfunc(APIDatabaseFileContext& f)
{
    APIDirectValueCursor c = f.OpenDirectValueCursor("SURNAME");
    c.SetOption(CURSOR_POSFAIL_NEXT);
    c.SetPosition("BROWN");

    //Make a "bookmark"
    APIDirectValueCursor ctemp = c.CreateClone();

    while (...)
        c.Advance(); //use cursor etc.

    //Return to bookmark and do something else
    c.AdoptPositionFrom(ctemp);

    //Probably faster now than going from the start
    c.SetPosition("JONES");
}

The SetPosition() function will always be a little more efficient than looping the cursor to a particular entry yourself, since it is looping in code one level lower. But more importantly there are times when it is much more efficient, since if the cursor is already some way into the b-tree (e.g. at "BROWN" above) it will only walk from that point, saving on page retrievals. This behaviour can also be more finely controlled, or even disabled, with options (SetOption()), as listed in apiconst.h.
CreateClone() is a way to improve efficiency if you have points within the b-tree you want to return to repeatedly. The clone acts as a "bookmark" and avoids having to walk the tree back to its position. As previously described, such cursors will reposition themselves appropriately if the b-tree is updated between their creation and use, and a cursor you left on e.g. "SMITH" might later appear on "SMITS" if "SMITH" got deleted in the interim.
Record store/delete
Field updating
Transaction backout
Re-reading updated data
Record store/delete

void Storer(APIDatabaseFileContext& f)
{
    APIStoreRecordTemplate r;
    r.Append("SURNAME", "THATCHER");
    r.Append("SALARY", "100000");
    r.Append("JOB", "Prime Minister");
    int new_record_number = f.StoreRecord(r);
}
If the same fields are used again and again when loading a large number of records, it's more efficient to just change the values each time after the first, rather than using the Append() function shown above to populate both the field name and value each time. A variety of alternative methods on the APIStoreRecordTemplate object make it possible to do that in different ways. (Including cases where some of the field names are fixed with a variable section afterwards - e.g. one or more multiply-occurring groups).
Field name=value pairs are stored into the database in the order they appear in the template. There is no equivalent of the UL syntax whereby a missing value for one of the field names supplied in a STORE statement means that pair is not stored - empty strings or zeros in the template value will always store empty strings or zeros.
If there are any problems with the store, it will throw an exception. Perhaps the most likely reason is an invalid value being specified for a field defined as FLOAT.
void Updater(APIRecord* r)
{
    std::string field("POLICY");
    APIFieldValue oldval("NATIONALISED INDUSTRY");
    APIFieldValue newval("NO MILK IN SCHOOLS");

    r->AddField(field, newval);
    r->ChangeField(field, newval);                  //1st occ
    r->ChangeField(field, newval, 3);               //3rd occ, invalid on invisible fields
    r->ChangeFieldByValue(field, newval, oldval);
    r->DeleteField(field);
    r->DeleteField(field, 3);
    r->DeleteFieldByValue(field, oldval);
    r->InsertField(field, newval, 3);               //invalid for invisibles
    r->DeleteAllOccurrences(field);
    r->Delete();                                    //delete entire record

    //If the old value or occurrence is important, e.g.
    if (r->ChangeFieldByValue(field, newval, oldval) > 0)
        cout << "Changed OK by value\n";
    else
        cout << "Old value not present, so added\n";

    //or e.g.
    if (r->DeleteField(field, 2, &oldval) == 2)
        cout << "2nd occurrence deleted, value was " << oldval.ExtractString().c_str() << endl;
    else
        cout << "No 2nd occurrence - nothing deleted" << endl;
}

As in User Language, you can perform a change or delete by giving an existing occurrence number or an existing value. A well-known User Language quirk is also propagated, namely that a change behaves like an add if the specified occurrence or value does not exist.
The return code from change and delete indicates which old occurrence was affected. -1 means the occ or value did not exist, and either delete had no effect or change was turned into add. There is also an optional parameter to let you retrieve the old value changed/deleted when using the occurrence syntax. This is shown in the final examples above.
If you request the old value to be returned from change or delete, it has the type of the data, not the index, where the two differ (e.g. STRING ORD NUM). This kind of field attribute combination is not recommended, however.
As with Store() above, there is no equivalent of the UL ADD and CHANGE syntax where no value is specified, in which case ADD does not add a value and CHANGE removes the value. The UL evaluator achieves the former effect by doing nothing, and the latter by using DeleteField().
In addition to invalid field values as mentioned earlier, another possible problem in a multi-thread situation is record locking, since all record-updating operations attempt to place an exclusive record lock on the record. If LPU ("Lock Pending Updates" - see below) is on, the lock is not removed again until Commit().
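As an illustration, here is a minimal defensive pattern. The Exception class members used are those shown in the C-wrapper example later; the exact exception code constant for a lock failure at this level is not shown in these notes (the C layer calls it DPT_DML_RECORD_LOCK_FAILED, as seen later):

void SafeUpdate(APIRecord* r)
{
    try
    {
        //Attempts to place an exclusive record lock
        r->AddField("STATUS", "PROCESSED");
    }
    catch (Exception& e)
    {
        //Record locked by another thread, invalid FLOAT value, etc.
        //e.What()/e.Code() as in the C-wrapper example later.
        cout << "Update failed: " << e.What() << " (code " << e.Code() << ")\n";
    }
}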
Whole set updates
Finally there are some updating functions that operate on APIFoundSet or APIRecordList MROs.
void f(APIDatabaseFileContext& f)
{
    APIFoundSet ffs = f.FindRecords();
    f.FileRecordsUnder(ffs, "TAG", "Retrieve later");   //field must be invisible
    f.DirtyDeleteRecords(ffs);
}

The dirty delete works like the User Language "DELETE RECORDS IN label" statement, viz. it simply turns off the bits corresponding to the records in the file's existence bit map. It places an exclusive record lock on the complete set, which may fail as with the other update statements above. FileRecordsUnder(...), however, does not apply any record lock.
Whenever you write application code to update records in a file you should bear data integrity issues in mind. The general topics of TBO and checkpointing are discussed in the System Config Guide - particularly important being your choice of the RCVOPT parameter setting to enable or disable these things. Taking an explicit checkpoint via the API is discussed in the Housekeeping section later, but here is the syntax for the TBO-related API functions, in the usual semi-plausible setting.
void Backouts(APIDatabaseServices& db, APIDatabaseFileContext& f)
{
    printf("%d records\n", f.FindRecords().Count());   //0 (sloppy - see hints and tips)
    f.StoreRecord();
    printf("%d records\n", f.FindRecords().Count());   //1
    db.Backout();
    printf("%d records\n", f.FindRecords().Count());   //0

    f.StoreRecord();
    db.Commit();
    db.Backout();                                      //info message - nothing to back out
    printf("%d records\n", f.FindRecords().Count());   //1

    f.StoreRecord();
    db.CloseContext(f);                                //implied commit at close
}
Note that exceptions thrown by the infrastructure don't necessarily invoke a backout. However if the exception is thrown so far out that some kind of destructor closes an updated file, any in-flight transaction including that file will get committed as part of the file close.
Re-reading updated data
The system must clearly allow a thread to see the updates it has made itself, even though the updated records are locked exclusively to other threads. In practical terms this has two implications.
One is at the record set level, where you may try to access a record you have already deleted - see earlier comments. In fact this is largely indistinguishable to DPT from the situation where you had an unlocked set and someone else deleted a record before you tried to access it.
The second implication is at the record level, where any remembered occurrence positions currently held for a record are "reset" when the record is updated. Again, this is analogous to the situation where you were looping on the occurrences of some field on an unlocked record, and someone else updated the record. To illustrate this situation, think of this User Language program operating on a record with 4 occurrences:
A: FOR EACH OCCURRENCE OF ADDRESS
     PRINT VALUE IN A
     IF OCC IN A = 2 THEN
       DELETE ADDRESS(2)
     END IF
   END FOR

On the third time round the loop, the PRINT statement will print what was originally the fourth occurrence, and only three values will be printed in total. This is because the DELETE operation told the Record MRO to discard any remembered occurrence position. On being called for the third time, the cursor duly repositioned on the second occurrence, which was the original occ 3, before advancing.
Another kind of record set is what is referred to in User Language as a "list". It behaves just like a found-set without locks in all the functions that have been discussed so far. Additionally it supports list operations, namely Place(...) and Remove(...), which come in two flavours each for dealing with single records, and sets of records (found sets or other lists).
Unlike in User Language, lists must be declared in an open context before they can be used.
void ListFuncs(APIDatabaseFileContext& f)
{
    APIFoundSet allrecs = f.FindRecords();
    APIRecordList minors = f.CreateRecordList();   //empty
    APIRecordList majors = f.CreateRecordList();
    majors.Place(allrecs);                         //place whole set

    for (APIRecordSetCursor c = allrecs.OpenCursor(); c.Accessible(); c.Advance())
    {
        APIReadableRecord r = c.GetCurrentRecordForRead();
        APIFieldValue v = r.GetFieldValue("AGE");
        if (v.ExtractNumber() < 18)
        {
            minors.Place(r);                       //place single record
            majors.Remove(r);
        }
    }
}

Cross-reference problems with lists, such as trying to place records from one file context on a list declared in another, will throw exceptions. Remember this applies to group contexts too, in that records from one of the members still can't be placed on a list declared against the group.
A sorted set is an in-memory data structure consisting of objects, one for each record in the pre-sorted set, which contain field data off the pre-sorted records. It can be counted, and iterated with a cursor just like the found sets and lists we've talked about already.
Being a copy of the data on the actual records, each of the sorted records is actually the same class as the record snapshot structure we saw earlier. However there is a difference, in that the sorted records do not necessarily contain all the field/value pairs off the original records.
When creating a sorted set you have a trade-off to consider between the time it takes to collect and sort the raw data, and the time you might save in later usage of the set. This trade-off is similar to the one in User Language which you can control by using either the "SORT RECORDS" statement or the "SORT RECORD KEYS" statement. Since the reader may not be that familiar with User Language, since the DPT UL compiler works a little differently (better!) to Model 204, and since the options available to an API program are wider, the following discussion takes things from first principles.
During a Sort() call, DPT first loops on all the records in the base set, collecting some or all of the fields from each record into a kind of structured array. Which fields are collected is determined by the sort specification supplied by the calling code.
Then the array is sorted, using the C++ standard library "stable sort" algorithm. Only pointers are moved around during this process, so the number of fields collected earlier would, in an ideal world, have no effect on the speed of the sort. However, collecting more fields than necessary is still unwise: the collection phase takes longer, and it can constrain std::stable_sort(), an algorithm which benefits from more available memory.
What is most likely to affect the speed of the sort though is how many fields you say should be compared to ascertain the correct ordering of records, (and in reality how alike the records are and therefore how much comparison needs to be done). The sort keys are the other main part of the sort specification supplied by calling code.
A simple example:
APISortRecordSet Sortem(APIFoundSet& s)
{
    APISortSpecification spec;
    spec.AddKeyField("SALARY");
    spec.AddKeyField("AGE", SORT_DESCENDING, SORT_NUMERIC);
    spec.AddDataField("SURNAME");
    return s.Sort(spec);
}

DPT first collects the three fields off each record in the unsorted record set, then performs a sort on SALARY, and within that descending AGE where necessary. The result is an array of record copies, each apparently containing 3 fields, which can be iterated using a record set cursor as previously seen with found sets.
Firstly, the read-only interface works by default off the sorted copy of the records. So in the above example, if we ask for the field "MIDDLE NAME" from the records in the sorted set, it will appear to be missing, even if the records in the file do actually contain the field. Secondly, the read-write interface causes the actual record to be updated on disk, but does not affect the sorted copy.
//c is a cursor on the above sorted set
APIFieldValue v;
APIReadableRecord rr = c.AccessCurrentRecordForRead();
v = rr.GetFieldValue("MIDDLE NAME");       //missing

APIRecord r = c.AccessCurrentRecordForReadWrite();
r.AddField("MIDDLE NAME", "SPENCER");
v = rr.GetFieldValue("MIDDLE NAME");       //Still missing
v = r.GetFieldValue("MIDDLE NAME");        //"SPENCER"

//Alternative function name
APIRecord real = c.AccessCurrentRealRecord();
v = real.GetFieldValue("MIDDLE NAME");     //"SPENCER"

The confusion that might be caused by the sorted copy getting out of step with the actual record is both the reason for the separate record-accessing function names in DPT, and for the changes CCA made to User Language a couple of versions ago to clarify updating against sorted sets. The "sort keys only" option covered below adds yet another variation on this theme.
Firstly on the subject of keys, you can specify a User Language style "BY EACH" ordering for any key. For an explanation of what this means consult the Model 204 User Language manual. Essentially the option is for dealing with multiply-occurring fields, and usually results in a sorted set containing many more records than the original.
There is also a corresponding option for non-key fields. In this case if you say collect each occurrence it just means that the sorted records will contain all occurrences of the specified field instead of just the default of the first occurrence. There is no effect on the number or order of the records in the final set.
A similar but more all-encompassing option can be used as a shorthand to say collect all occurrences of all fields. This is what you would do if you were planning to do a total dump ("PAI") of the set.
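Purely to indicate how these options might slot into the sort specification, here is a sketch. The constant names SORT_ASCENDING, SORT_BY_EACH and SORT_ALL_OCCS are assumptions here; check apiconst.h for the real ones:

APISortRecordSet SortEach(APIFoundSet& s)
{
    APISortSpecification spec;
    //"BY EACH" style key - one sorted record per ADDRESS occurrence
    //(constant names are assumptions)
    spec.AddKeyField("ADDRESS", SORT_ASCENDING, SORT_BY_EACH);
    //Collect every occurrence of a non-key field
    spec.AddDataField("PHONE", SORT_ALL_OCCS);
    return s.Sort(spec);
}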
You can also limit the number of records from the original set that will participate in the collection phase. This is commonly used to get a small sample from a large file for diagnostic purposes. In the following example the sorted set will contain only 100 records, with data collected from the first 100 in the found set. It corresponds to the User Language syntax "SORT 100 RECORDS...".
APISortRecordSet Sortem(APIFoundSet& s)
{
    APISortSpecification spec(100);
    spec.AddKeyField("JOB TITLE", SORT_DESC, SORT_RIGHT_ADJUSTED);
    spec.SetOptionCollectAllFields();
    return s.Sort(spec);
}

Finally there is the "keys only" option, which affects both the collection phase and the later processing phase. If you set this option, the collection phase collects only the key fields, and nothing else, from the original data. It also sets a flag in the structure representing the sorted set to say that all requests for field information should go through to the corresponding actual record in the file (even the keys). In other words if you use AccessCurrentRecordForRead() as discussed earlier, it actually invokes AccessCurrentRealRecord(), as with User Language "SORT RECORD KEYS".
Using the "keys only" option gives a result set that is therefore the most behaviourally interchangeable with using a found set, but sacrifices any benefits that would have come from doing a single pass across the raw data and then making multiple accesses to the sorted copy. For example on Model 204 a typical scrolling 3270 list display program would be significantly slowed down if it had to keep accessing the same records on disk as the user paged up and down.
Summary
File creation and initialization
Field definition maintenance
Sizing, loading and reorging
Sequential file access
Viewing and resetting file parameters
Summary
Most of the DBA functions correspond to M204 commands, and more general background information in all cases can be found in the DBA Guide.
All these functions require exclusive access to the complete file. In the case of CREATE, the file must not be open at all. In all other cases you must have the file open as a non-group context, and the usual shared enqueue will be upgraded to exclusive for the duration of the function. Therefore you should be prepared to catch a sharing-violation exception, which will be thrown either if other threads have the file open, or if the calling thread has any open found sets etc. against the context.
All DBA functions are non-backoutable, and DPT, like Model 204, will not allow backoutable and non-backoutable updates to be mixed in the same transaction. When writing an API program this is more of a consideration than it is when working as a Model 204 terminal user, because of the implied commit which always comes at return to terminal command level on M204. The API throws an exception if you try a DBA function while there are still uncommitted data updates. The reverse situation cannot occur on DPT, because DBA functions all commit when they finish.
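To illustrate, here is a minimal sketch (field names and values are invented for the example). The DefineField() call would throw if the Commit() line were removed, since a backoutable store would still be in flight:

void DefineSafely(APIDatabaseServices& db, APIDatabaseFileContext& f)
{
    APIStoreRecordTemplate r;
    r.Append("SURNAME", "MAJOR");
    f.StoreRecord(r);                      //backoutable data update

    //Calling f.DefineField(...) at this point would throw,
    //since the data transaction is still uncommitted.
    db.Commit();                           //end the data transaction first

    APIFieldAttributes atts(FDEF_STRING, FDEF_ORDERED_CHARACTER);
    f.DefineField("REGION", atts);         //non-backoutable DBA function - OK now
}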
File creation and initialization
Like on Model 204, file creation must be performed with the file closed. The API function call can be given anything from zero to eight file parameter values, corresponding to the most common non-default parameters that are used when creating Model 204 files. Other lesser used parameters can be changed later by resetting them.
Calling Create() won't ask you for confirmation like the Model 204 CREATE command does! It does however ensure that if the file exists already nobody has it open.
void f(APIDatabaseServices& db)
{
    db.Allocate("SALES", "sales.dpt");
    db.Create("SALES",
              1000,                 //bsize
              22,                   //brecppg
              -1,                   //breserve (-1 means take default)
              -1,                   //breuse
              2000,                 //dsize
              -1,                   //dreserve
              -1,                   //dpgsres
              FILEORG_UNORD_RRN);   //fileorg
}
The only valid FILEORG values are 0 and 36. The bits for RRN and unordered (0x04 and 0x20 respectively, so 0x24 = 36 decimal) are kept separate for M204 familiarity purposes, but on DPT they must be specified as a pair.
Field definition maintenance
Fields can be defined in a file at any time, even when the file already contains records - this is one of the great strengths of Model 204, and by extension DPT. All existing records in the file are considered simply to be missing the new field. Defining fields can only be done in a context which consists of a single file - no groups. However you can query field attributes in a group context, since they must be largely consistent.
In User Language and at the Model 204 command line, field names are occasionally specified with embedded quotes in order to circumvent the parsers' normal behaviour when it comes to reserved words. This is not necessary when using the API, and field names should be given exactly as intended. Contradictory or incomplete combinations of attributes result in an exception being thrown.
The set of field attributes for a field is specified as a structure, "APIFieldAttributes", and this structure is used for both defining and querying.
void Def(APIDatabaseFileContext& f)
{
    std::string fname("SALES Q1");
    APIFieldAttributes fatts(FDEF_STRING, FDEF_ORDERED_CHARACTER);
    f.DefineField(fname, fatts);

    //Retrieve just as we defined it
    fatts = f.GetFieldAtts(fname);

    //Perhaps redefine
    fatts.SetInvisibleFlag();
    f.RedefineField(fname, fatts);
If you want to display all the fields and their attributes this is done with a cursor. Manipulating the cursor is conceptually the same as using the $LSTFLD function in User Language.
    ...
    //Emulate 'D FIELD (ABBREV) ALL'
    APIFieldAttributeCursor c = f.OpenFieldAttCursor();
    for (; c.Accessible(); c.Advance())
    {
        cout << c.Name().c_str() << ": ";
        APIFieldAttributes a = c.Atts();
        if (a.IsInvisible()) cout << "INV ";
        else if (a.IsFloat()) { /* ... etc. */ }
        cout << endl;
    }
    f.CloseFieldAttCursor(c);
}
If you define a field while the cursor is in existence (e.g. inside the loop in the above example), the cursor will not pick it up. Close and reopen the cursor if you need to do that.
Sizing, loading and reorging
Before version 3.0 DPT files had to be loaded using StoreRecord() and/or the field updating functions as covered already. From version 3.0 onwards you also have the option of using the fast load feature. This will certainly be faster if the input data can easily be arranged in one of the accepted formats, since DPT parses the input file efficiently, and inserts information into the database via the most direct possible paths.
Here are the function names, called with default parameters. All these functions accept various options, format specifications etc., as per the equivalent commands and $functions.
void FastFunctions(APIDatabaseFileContext& f)
{
    //Reorganize the file
    f.Unload();
    f.Initialize();
    f.Load();

    //All-in-one equivalent
    f.Reorganize();
}
If not using fast load, it's still probably best to turn on the deferred update feature, and that is done just like at the DPT host command line, by using special variations of the OpenContext() function. There are two flavours of deferred update processing, namely the original and rather complex multi-step process, and the (from V2.14) much simpler and almost certainly much faster single-step process. Examples of both are given below.
Single-step
void Load(APIDatabaseServices& db)
{
    APIDatabaseFileContext f = db.OpenContext_DUSingle("FIXTURES");

    //Load data as normal
    //StoreRecord() etc.

    //This applies any final chunk of index updates
    db.CloseContext(f);
}
Multi-step
Obviously a lot more complicated, and not really recommended from V2.14 on.
void Load(APIDatabaseServices& db)
{
    //Allocate the two work files
    APISequentialFileServices ss = db.SeqServs();
    ss.Allocate("TAPEA", "tapea.txt");
    ss.Allocate("TAPEN", "tapen.txt");

    //V2.14 note new function name
    APIDatabaseFileContext f = db.OpenContext_DUMulti("FIXTURES", "TAPEN", "TAPEA");

    //Load data as normal
    //e.g. StoreRecord() etc.

    //Make the files available for external sort program
    db.CloseContext(f);
    ss.Free("TAPEA");
    ss.Free("TAPEN");

    //Invoke external sort somehow, and wait for completion
    //ShellExec(...) etc.

    //Then reopen file
    ss.Allocate("TAPEA", "tapea.txt");
    ss.Allocate("TAPEN", "tapen.txt");
    f = db.OpenContext("FIXTURES");

    //And apply the sorted deferred updates
    f.ApplyDeferredUpdates(/* ...optional parms as per Z command... */);
}
You could alternatively structure the whole multi-step load so that it consisted of two distinct DPT runs with a sort in between, all controlled by a DOS batch job, and that would correspond to the way you'd more usually do things with JCL and Model 204. (You could even load the data in several separate runs, perhaps over several days, appending to TAPEA and TAPEN each time, before finally sorting the lot and loading them.) The single-program method was used in this example just for neatness, but it is worth pointing out that some different function parameters might be required depending on how you structure the process overall, because the DPT API largely mirrors the operation and conventions of the Model 204 ALLOCATE and OPEN commands.
The Increase() function is also available, emulating the same-named command from Model 204:
void Incr(APIDatabaseFileContext& f)
{
    //Add an extent to table B
    f.Increase(100, false);

    std::vector<int> e;
    f.ShowTableExtents(&e);

    //Should now show 3 extents (B,D,B)
    for (int x = 0; x < e.size(); x++)
        printf("%c : %d \n", (x % 2) ? 'D' : 'B', e[x]);
}
The underlying DPT API also provides some diagnostic functions which produce, for example, the output you would get at an M204 terminal from the TABLEB and ANALYZE commands. These are not currently exposed in the user API, for simplicity; for this kind of work you might often be better off allocating the file to the main DPT host and issuing those commands by hand. However, these diagnostic functions could easily be added to the user API in future if required.
Sequential file access
The API contains a service object for managing sequential files as they would be managed on Model 204. That is to say you allocate them to the system like a database file, open them for shared or exclusive access, and then read/write to them one "line" or "record" at a time instead of with fields and values. Note however that there is little benefit to using this API feature unless you want Model 204 style functionality - in many cases the calling language's standard file IO would be equally or more appropriate.
When using a sequential file, a "record" means a CRLF-terminated string, and may or may not have a fixed length. The issue of record lengths and formats is discussed in the DBA guide.
A sequential file is, in fact, a specialization of DPT's general purpose line I/O classes. A particular instance can be used for read or write, but not both. This is because of potential issues writing variable-length records (the usual case) into the middle of a file.
APIDatabaseServices db("CONSOLE");
APISequentialFileServices ss = db.SeqServs();
ss.Allocate("report", "output\\report.txt");
APISequentialFile f = ss.OpenView("report");

f.WriteLine("Report on Probability A");
f.WriteLine("-----------------------");
f.Write    ("Sample size: ");
f.WriteLine(IntToString(trials));
//etc...

ss.CloseView(f);
ss.Free("report");
The infrastructure provides support for groups in all the situations you expect from M204, by using the APIDatabaseFileContext object we have seen all through these notes. Simply specify a group-style context name when opening the context, such as "TEMP GROUP SALES", or even just "SALES", since if there is a group with that name it will be assumed that's what you mean even if there is also a file called "SALES".
Then use the context as normal. In cases where a function is not appropriate to group contexts, like defining a field or initializing the file, an exception is thrown. If you want to perform a search just on a particular member of a group, the FILE$ criterion is what you need. This is how DPT's User Language compiler implements the IN GROUP MEMBER syntax.
Group definitions are manipulated using the "APIGroupServices" service object:
void Groups(APIDatabaseServices& db)
{
    APIGroupServices g = db.GrpServs();

    std::vector<std::string> members1;
    members1.push_back("SALES02");
    members1.push_back("SALES03");
    members1.push_back("SALES04");
    g.Create("SALES", members1, GROUP_PERM);    //system-wide group
    APIDatabaseFileContext f1 = db.OpenContext("SALES");

    std::vector<std::string> members2;
    members2.push_back("SALES00");
    members2.push_back("SALES04");
    g.Create("SALES", members2, GROUP_TEMP);    //"temp" = user-specific group
    APIDatabaseFileContext f2 = db.OpenContext("SALES");   //gets the temp group now

    APIFoundSet s1 = f1.FindRecords();                      //recs in 02,03,04
    APIFoundSet s2 = f2.FindRecords();                      //recs in 00,04
    APIFoundSet s3 = f1.FindRecords(FD_FILE$, "SALES03");   //recs in 03
    APIFoundSet s4 = f2.FindRecords(FD_FILE$, "SALES03");   //nothing
}

When you define a group it creates a hidden system-managed object representing the group, which is used to resolve OpenContext(...) calls. Your code is not allowed direct access to these objects since they are usually shared across all threads. Instead, to request information about them go via Group Services using group names, and it will ensure thread-safe access to the shared information.
There is no built-in periodic automatic checkpointing at the API level - the main DPT host performs this task via a daemon thread in the Session layer. Therefore, if you are building an "OLTP" style of application, you will need to ensure Checkpoint() calls are made at appropriate intervals. The simplest way is to spawn a dedicated user thread like the main DPT host does, but with clever coding you could probably make it a shared responsibility between all the "normal" user threads. In non-OLTP applications this would not be necessary; for example, in both batch update runs and read-only "OLAP" style applications you might as well turn checkpointing off altogether.
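A minimal sketch of the dedicated-thread approach follows. The exact location and signature of the checkpoint function are covered in the Housekeeping section, so the db.Checkpoint() call and the ten-minute interval here are indicative assumptions only:

void CheckpointDaemon(void*)
{
    //Each thread must create its own service objects
    APIDatabaseServices db("CHKPDAEMON");
    for (;;)
    {
        Sleep(10 * 60 * 1000);   //e.g. every ten minutes (Windows Sleep())
        db.Checkpoint();         //assumed name/location - see Housekeeping
    }
}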
Rollback is not performed automatically by DPT when it starts the first user thread, since sometimes you might want to start the system and examine files in their un-rolled-back state. Therefore if you want the system to perform rollback you have to code for it, and this ranges from very simple to quite complicated. In its simplest form you can let the system perform all the default processing, issuing messages to the audit trail, as follows:
APIDatabaseServices db("CONSOLE");
if (db.Rollback1() == ROLLBACK_FAIL)
{
    //Close down system
    printf("Recovery failed: %s \n", db.RecoveryFailedReason().c_str());
    exit(0);
}
else
{
    //Close any files opened for rollback and start normally
    db.Rollback2();
}

There are a variety of ways you can make this more elaborate, mainly by installing appropriate callback functions to have the work involved in the main phase (Rollback1()) executed interactively. For example the main DPT host installs callbacks which display a dialog box showing the progress of the various phases of rollback, and offering the user the chance to cancel, or in some cases bypass them.
There are 8 such callbacks you can install, 5 related to the 5 main phases of the recovery process, and 3 related to progress within each phase. For the sake of brevity these are not covered here, but if anyone wants to build such an application, just get in touch with DPT HQ and full boring technical details can be supplied.
On DPT all these objects can be accessed via a single interface, which passes view/reset requests to the appropriate parts of the system. If it's a file parameter (e.g. BSIZE), you must also specify which file to go to.
void f(APIDatabaseServices& db) { APIDatabaseFileContext f = db.OpenContext("SALES"); APIViewerResetter vr = db.Core().GetViewerResetter(); std::string userid = vr.View("USERID"); int maxbuf = vr.ViewAsInt("MAXBUF"); std::string smaxbuf = vr.View("MAXBUF"); //OK - comes back as string vr.Reset("USERID", "George"); //no good - not resettable vr.Reset("FISTAT", "0"); //no good - file required vr.Reset("FISTAT", "0", f); //OK }There are occasions when a Reset() call is accepted, but the actual resulting value is not exactly the same as the one given, for example if the parameter is a collection of bit settings and an invalid bit is given among otherwise valid bits. In such cases the actual value used comes back as the return value of Reset(). This slightly odd convention reflects the way Model 204 handles the RESET command under similar conditions.
void f(APIDatabaseServices& db) { APIDatabaseFileContext f = db.OpenContext("SALES"); APIStatViewer sv = db.Core().GetStatViewer(); cout << "Sys DKRD = " << sv.View("DKRD", STATLEVEL_SYSTEM_FINAL); cout << "User DKRD = " << sv.View("DKRD", STATLEVEL_USER_LOGOUT); cout << "File DKRD = " << sv.View("DKRD", STATLEVEL_FILE_CLOSE); //no good - file required cout << "File DKRD = " << sv.View("DKRD", STATLEVEL_FILE_CLOSE, f); //OK sv.StartActivity("ACT1"); //some long activity int cpu1 = sv.View("CPU", STATLEVEL_USER_SL); sv.StartActivity("ACT2"); //Another long activity int cpu2 = sv.View("CPU", STATLEVEL_USER_SL); //A very active stat - we want all possible bits int hiword = 0; unsigned int loword = sv.View("SEQOUT", STATLEVEL_USER_SL, &hiword); }Statistics are by default returned as 32 bit unsigned integers to keep the API simple, but actually they're held as 64 bits. The final example above shows how to use an optional parameter on View() to request the most-significant word as well when you're really clocking them up that high.
The string passed to StartActivity() can be anything you like - it just helps to distinguish in the audit trail the output which this call triggers. This may be familiar to User Language programmers from e.g. the $SLSTATS function, or the "COMP" and "EVAL" groups studied when tuning a program.
On Model 204 each such situation has a default response, which is assumed for example if the user is in a subsystem, or is running on a non-interactive thread type like a daemon. When using the database API these situations invoke an optional user-installed callback function. By default no callback is installed and the default action is taken. To install your own you need a function with the correct prototype, as in the following example:
bool MyFunc(const std::string& prompt, bool default_response, void* obj)
{
    //ask the user - display a dialog box - whatever.
    //maybe show them what the default response will be.
    //return true=yes or false=no.
}

APIDatabaseServices db;
db.Core().RegisterInteractiveYesNoFunc(MyFunc);

After this, the DYRWT prompts will invoke MyFunc() with the appropriate parameters. In an object-oriented environment you might have a particular object instance which you want to handle the interaction, and that is what the third parameter of the callback is for. By default MyFunc() above will be called with a null third parameter, but we might also invoke the following when starting up a user thread:
...
class SessionHandler
{
    bool YesNo(const std::string&, bool);
};

SessionHandler my_session_handler;
db.Core().RegisterInteractiveYesNoObject(&my_session_handler);

The registered object can be of any type - your callback function then has to cast it. The earlier example would then look like this:
bool MyFunc(const std::string& p, bool d, void* obj)
{
    return reinterpret_cast<SessionHandler*>(obj)->YesNo(p, d);
}

This is how the main DPT host system implements the callback. Each thread registers its terminal handler, which then performs DYRWT interactions using regular terminal line I/O where the user has to enter Y/N.
Only one of each service object can be created on a single thread, because the OS thread ID is used to implement most low-level resource locking. Therefore a multi-user application is by definition multi-threaded, with all the enjoyable sharing issues that entails. DPT takes care of safe sharing of all the information under its control, and we'll assume for the sake of this discussion you're happy with your responsibilities for the thread-safety of the logic you build on top.
In the sketched-out example below, user 0 runs on the main thread and this makes sense, since when main() terminates, the application as a whole is closed down by Windows.
void usermain(void* username)
{
    APIDatabaseServices api(/* user specific output etc */);
    for (;;)
    {
        //accept user input, perform work
        //check for bump
    }
}

void listenermain(void* portnumber)
{
    for (;;)
    {
        accept(/* ... */);
        //get user name etc.
        beginthread(usermain, /* etc */);
    }
}

int main()
{
    //Run user 0 on the main thread
    APIDatabaseServices api;

    //Kick off other threads as required
    //e.g. listen for socket connections...
    beginthread(listenermain, /* etc */);
    //...or just spawn user threads directly
    beginthread(usermain, /* etc */);
    beginthread(usermain, /* etc */);
    //etc.

    //Do any required work with user 0's API

    //Somehow decide when to shut down
    //Close and/or bump other threads

    return api.Core().GetRouter().GetJobCode();
}

Unless user 0 is held up in some way, this application will not stay running long enough for the other threads to do any work. This is the point at which the main DPT host system, like Model 204, can be told to wait with the HALT command, but in an API program you must devise some similar mechanism yourself. This could be done based on shared resources (see later), some kind of messaging, or simply user 0 keeping watch until other threads "seem" to have finished.
In any case, when the time comes to shut down there are a couple of things you should do to make sure everything works smoothly, although in the above example simply allowing user 0's service object to de-scope should be OK, since the end of main() implies application closedown, including all subthreads. However, if user 0 were running on a separate thread to main(), the de-scope of its service object would not close down other threads, since the DPT infrastructure as a whole releases shared resources during closedown of the last thread, which need not be the first one started.
Therefore to make a clean job of it when the time comes, one thread should tell DPT not to accept any new users, and then maybe even bump off existing users (see later). The former ("quiesce") is achieved on the main DPT host using the EOD command, and equivalently in an API program with e.g.
...
APICoreServices core = api.Core();
core.Quiesce();
while (core.NumThreads() > 1)
    ;   //wait, or use bump if impatient
void Monitor(APIDatabaseServices& db)
{
    std::vector<int> v;
    db.Core().GetUsernos(v);
    for (int x = 0; x < v.size(); x++)
    {
        //monitor user...

Any DPT API thread can get information about what other threads are doing. This is achieved by, for example, thread A requesting a handle to thread B's APIDatabaseServices object and then using it much as normal. Typically thread A would only want to view statistics or parameters, but things like getting a list of open files are also OK - that's in fact what the LOGWHO command does. More proactive stuff like trying to open files on behalf of thread B would often fail because of the previously-mentioned thread ID/locking issues.
The big problem with this simple scheme is how we can be sure that thread B doesn't log off, and therefore invalidate all its service objects, while we're using them. The answer is that we have to lock the user in, which is achieved as follows. This is one place where the API forces you to use a privileged sentry object, to avoid the risk of leaving a user locked in, which might put DPT into a deadly embrace situation.
        ...
        APIUserLockInSentry s(v[x]);
        if (s.other == NULL)
            continue;   //User logged off since we got the list
        else
        {
            //User is now locked in
            printf("User %d %s\n", x, s.other->Core().GetUserID().c_str());
            printf("%s\n", s.other->Core().GetStatViewer().UnformattedLine(STATLEVEL_USER_SL).c_str());
            printf("WT = %d\n", s.other->Core().GetWT());
            printf("Open files: ");
            std::vector<APIDatabaseFileContext> of = s.other->ListOpenContexts();
            //... etc.
        }
    }
}
When thread B is in a "non-bumpable" state, it will take no notice of thread A until it comes out of that state of its own accord. A good example is when thread B is waiting for disk IO. On the other hand, if thread B is doing some kind of CPU-intensive work it will usually be checking periodically to see if anyone has bumped it. You can also explicitly check for bump if your code is doing some intensive processing of its own outside the DPT API, by calling the Tick() function. In all cases, if the user has been bumped the result is that an exception is thrown.
//User 5:
core.ScheduleForBump(20);
//User 20:
try
{
    for ( /* something lengthy */ )
    {
        //perform a slice of work
        core.Tick("in loop to simulate earth weather");
    }
}
catch (Exception& e)
{
    //e.What() contains the text given to Tick()
}
int Counter(APIDatabaseFileContext* f, const char* fname, const char* fval)
{
    return f->FindRecords(fname, FD_EQ, fval).Count();
}

The above syntax looks neat, but in many cases it is not a good idea. It creates a found set and then discards all references to it without destroying it, which means the chunk of memory representing the found set is left lingering around until the context gets closed. If the file was big and this function was called repeatedly before closing it, you might very well run out of memory.
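A safer version might therefore destroy the set explicitly before returning. The set-destroying function name used here is an assumption, by analogy with the C layer's DatabaseFileContextDestroyAllRecordSets() seen later:

int Counter2(APIDatabaseFileContext* f, const char* fname, const char* fval)
{
    APIFoundSet s = f->FindRecords(fname, FD_EQ, fval);
    int n = s.Count();
    f->DestroyRecordSet(s);   //assumed C++ counterpart of the C layer's
                              //"destroy record sets" functions
    return n;
}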
Such schemes can get very complex though, so a third, simpler approach, supported by two new API functions, is to let RecordSets act as random-access cursors for themselves, so long as you know the internal database record numbers. These can be retrieved in an efficient integer array operation even for very large sets, and then used to access records as required by the scrollbar, without needing to reverse and/or reopen regular directional cursors. This lets the client cache and de-cache as much or as little record data as it wishes, with minimal worries about code complexity or efficiency.
Example:
void Scroller(APIRecordSet* set)
{
    int* recarray = set->GetRecordNumberArray(NULL);

    //... populate GUI control structures ...

    //Fill page
    for ( /* each recnum on the page */ )
    {
        APIRecord rec = set->AccessRandomRecord(recnum);
        //...
    }

    //later, if stack array not supplied above
    delete[] recarray;
}
Why C wrappers are necessary
The above factors unavoidably make user code less elegant, but some effort has been made to ensure the C API is as clear and easy to use as possible.
Why C wrappers are good
Some of the good things that are lost by dropping from C++ to C
This is not really the place to list all the reasons why C++ was an improvement over C, and why it makes interfaces like the DPT database API much more elegant. However, here are a few specific things that seem worth mentioning.
...and some good things that are retained
DPTErrNum BitMappedRecordSetCount(DPTUser u, BitMappedRecordSetHandle hset)
{
    DatabaseServices* dbapi = (DatabaseServices*) u;
    CAPICommArea* commarea = dbapi->GetCAPICommArea();   //locate area to put results in
    try
    {
        BitMappedRecordSet* set = (BitMappedRecordSet*) hset;
        commarea->int_result = 0;

        /* The actual function call: */
        commarea->int_result = set->Count();             //place result there
        return DPT_OK;
    }
    catch (Exception& e)
    {
        commarea->error_code = e.Code();                 //place error info
        commarea->error_message = e.What();
        return e.Code();
    }
    catch (...)
    {
        /* etc. */
    }
}

So the overhead per call is an extra call to this C wrapper function, a couple of indirect memory accesses to initialise control data, and a couple of instructions for the C++ "try". The casts are dealt with at compile time, so they add no overhead. On top of this there is any overhead added by parameter conversion that would not have been required in C++ - perhaps some cases where STL <string> variables would be used several times rather than incurring multiple strlen() etc. calls when using const char* in C. But in general this is a minor overhead, since most parameters are basic types like ints and pointers, and with strings the lengths can be supplied to some functions.
This contrasts with the situation when another language layer like Java is added on top, communicating via the C layer. Experience from the similar Python/SWIG situation has shown that the process of deconstructing parameter variables from their "managed" form in the calling language to fundamental data, and then reconstructing "managed" result objects after the function call, can add a significant overhead in situations where a lot of calls are made to API functions which individually do very little - for example building the StoreRecordTemplate before storing a large record. On the other hand, compared to the cost of performing a database search, all of the API overhead, including parameter passing, is trivial.
#include "stdio.h"
#include "dptcapi.h"

int Error(DPTUser user)
{
    const char* msg = DPTErrorMessage(user);
    int code = DPTErrorNumber(user);
    printf("DPT error code %d, message: %s\n", code, msg);
    if (user) DPTLogoff(user, NULL);
    return code;
}

int main()
{
    char msg[256];
    DPTUser user = NULL;
    DPTDatabaseFileContextHandle hfile = NULL;
    DPTFoundSetHandle hset = NULL;
    DPTRecordSetCursorHandle hcursor = NULL;
    DPTReadableRecordHandle hrec = NULL;

    /*== Start user thread ==*/
    if (DPTLogon("CONSOLE", NULL, NULL, NULL, NULL, &user, msg))
    {
        printf("Logon error: %s\n", msg);
        exit(999);
    }

    /*== Open a file ==*/
    if (DatabaseServicesOpenContext(user, "SALES")) return Error(user);
    hfile = DPTMRO_Context(user);

    /*== Perform database search ==*/
    if (DatabaseFileContextFindRecordsAll(user, hfile)) return Error(user);
    hset = DPTResultFoundSet(user);

    /*== Loop on records ==*/
    RecordSetCount(user, hset);
    printf("%d records in found set\n", DPTResult_Int(user));
    if (RecordSetOpenCursor(user, hset)) return Error(user);
    hcursor = DPTMRO_RecordSetCursor(user);
    for (CursorAccessible(user, hcursor);
         DPTResult_Bool(user);
         CursorAdvance(user, hcursor), CursorAccessible(user, hcursor))
    {
        RecordSetCursorAccessCurrentRecordForRead(user, hcursor);
        hrec = DPTMRO_ReadableRecord(user);

        /*== Print each F=V pair ==*/
        for (RecordAdvanceToNextFVPair(user, hrec);
             DPTResult_Bool(user);
             RecordAdvanceToNextFVPair(user, hrec))
        {
            printf("  %s = ", DPTResult_String(user));
            if (DPTResult_FieldValueIsNum(user))
                printf("%f\n", DPTResult_FieldValueN(user));
            else
                printf("%s\n", DPTResult_FieldValueA(user));
        }
    }

    /*== Release all resources etc. ==*/
    RecordSetCloseCursor(user, hset, hcursor);
    DatabaseFileContextDestroyAllRecordSets(user, hfile);
    DatabaseServicesCloseAllContexts(user);
    DatabaseServicesFree(user, "SALES");
    if (DPTLogoff(user, msg)) printf("Logoff error: %s\n", msg);
    return 0;
}

Notes on the above example (in no particular order):
Before making internal DPT calls, the wrapper code sets the expected result variable for the current function to some default value, namely zero, NULL or an empty string as appropriate.
Multi-result functions
Some functions populate more than one result variable - there are comments indicating such cases in the header files. An important group are the functions retrieving field values, where not only is the value placed in the results area, but also a flag saying whether the value was string or numeric. In some of these functions three or even four result values are cached. For example the RecordAdvanceFVPair(...) function shown below for reading a single record's contents sets a flag saying whether the last field was reached (bool), a string containing the field name, the field value, and the flag saying whether the field value was numeric.
void RecordLoop(DPTUser u, ReadableRecordHandle rec)
{
    while (RecordAdvanceFVPair(u, rec) == DPT_OK && !DPTResult_Bool(u))   //last field reached?
    {
        printf("%s = ", DPTResult_String(u));             //field name
        if (DPTResult_FieldValueIsNum(u))                 //numeric flag
            printf("%f \n", DPTResult_FieldValueN(u));    //field value
        else
            printf("%s \n", DPTResult_FieldValueA(u));
    }
}

Array results
void ListUsersConnected(DPTUser u)
{
    int x, n;
    CoreServicesGetUsernos(u, "ALL");
    n = DPTResult_ArraySize(u);
    for (x = 0; x < n; x++)
    {
        DPTResult_FocusArrayItem(u, x);
        printf("User # %d:\n", DPTResult_Int(u));
        //See samples for more complex LOGWHO-style output.
    }
}

Utility classes overlaid with C structs
void RedefineField(DPTUser u, DatabaseContextHandle hfile)
{
    const char* fname = "SALES_2010";
    DPTErrNum e = DPT_OK;

    if ((e = DatabaseFileContextGetFieldAtts(u, hfile, fname))) exit(e);

    //Here we have a stack struct
    DPTFieldAttributes att = DPTResult_FieldAtts(u);
    att.flags = (DPT_FATT_INVISIBLE | DPT_FATT_ORDERED | DPT_FATT_ORD_NUM);

    if ((e = DatabaseFileContextRedefineField(u, hfile, fname, &att))) exit(e);
}
Utility objects managed via handles
Following on from the above, there are other utility classes which C++ calling code would normally own and manipulate directly but which aren't simple enough to just be overlaid and manipulated via their member variables by C code. Examples are FindSpecification and SortRecordsSpecification. In these cases DPT returns a handle from one or more Create(...) functions, plus there is a Destroy() and other manipulating functions. After Create() the handle is retrieved via a DPTStruct_...() function, to differentiate these objects from the usual MROs where it is DPTMRO_...().
void FindAndPrintCount(DPTUser u, DatabaseContextHandle hfile)
{
    DPTFindSpecificationHandle hspec = NULL;
    FindSpecificationCreateA1(u, "SURNAME", DPT_FD_LIKE, "A*");
    hspec = DPTStruct_FindSpec(u);

    if (DatabaseFileContextFindRecords(u, hfile, hspec) == DPT_OK)
    {
        int x, n;
        DPTFoundSetHandle hset = DPTMRO_FoundSet(u);
        RecordSetCount(u, hset);
        n = DPTResult_Int(u);
        FoundSetLockType(u, hset);
        x = DPTResult_UInt(u);
        printf("%d records in %s found set\n", n,
               (x == DPT_FD_LOCK_NONE) ? "UNLOCKED" :
               (x == DPT_FD_LOCK_SHR)  ? "SHR" : "EXCL");
    }

    if (hspec) FindSpecificationDestroy(u, hspec);
}
if (DPTErrorNumber(u) == DPT_DML_RECORD_LOCK_FAILED)
{
    int rlcuser = DPTRLC_User(u);
    //DPTRLC_File(), DPTRLC_Recnum() etc.
    ...
}
This chapter gives some background on how the Java API layer (brown in the diagram) relates to the base C++ code and intermediate C layer, as well as details on how to use it, and some code examples.
The Java API is presented as a set of interfaces, each corresponding to one of the base C++ classes (green), and each Java method corresponding to one of the C++ class member functions, with identical ultimate effects on the underlying DBMS since each call ultimately passes through to the same C++ function. Therefore, if you've never used the DPT API at all before and are starting with Java, the walkthroughs and programming notes which make up the bulk of the document above should be your main reference, even though they are in C++. (First try the Java "Hello World!" example in the download readme.)
This scheme of presenting the documentation is obviously the result of the API starting out as purely C++, and it will make certain things harder to understand for the reader who only knows Java. This may be remedied in the future with a complete stand-alone Java API document which does not assume familiarity with both C++ and Java, but for now we will have to make do with this brief appendix.
Java string representation
DPT's underlying API layers (C and C++), and its database STRING field type, all work exclusively with one-byte-per-character string representations based on the default character set on the machine where the DBMS is running - typically ASCII. Java on the other hand has a Unicode String variable type which uses two or more bytes per character. This means we have to think about how and when conversion between the two formats will happen, and in what cases it is reasonable for Java client programs to work with String variables constructed from DPT database field values using the standard Java language conversions/constructors.
There are three main cases to consider. Firstly, text parameters like user names and field names. There is no confusion here, and Java clients can comfortably use the String interfaces provided by the DPT API, since there will be no weird characters involved. Calling code will look the same as the equivalent C++ using STL std::string or const char*. E.g.
int s = parmViewer.view("SYSOPT");
String fName = "LASTNAME";
FieldAttributes fAtts = context.getFieldAttributes(fName);
Secondly there are field values for STRING or BLOB database fields, but where the values are just plain text. So that's all the data-access functions for storing records, changing field values, performing searches etc. If the data is just text, the same comments as above apply, as long as you're happy with the basic ASCII/Unicode conversions.
Finally there are the database fields where the actual binary content is important. Obviously this means BLOB fields if they contain images/videos/etc., but it also includes plain STRING fields where the content has meaning such that you need total control of the length and binary value stored. In these cases calling code should use the functions which take and return Java byte[] variables, and/or use FieldValue parameter objects constructed from byte[]s. Values will then get passed directly into/out of the underlying DPT system as single-byte-per-character "strings", without passing through a Unicode representation at all. As an extra facility you can also load the DPT custom FieldValue object with a String value in hex format such as "X'F0123C'" and then call convertHexStringToByteArray() to get it converted to byte[] form, thus keeping total control. That sequence facilitates the use of binary values typed in by end users. For example:
String userEntry = "414243";   //hex for ASCII "ABC"
FieldValue fVal = new FieldValue(userEntry);
fVal.convertHexStringToByteArray();
record.addField("CODE", fVal);
The reverse function convertToHexString() works on an object with any current internal type; in particular, if the object holds a byte[] fresh out of the database, it will not go via Java/Unicode format but be directly converted into the hex representation of the bytes - "414243" in the above case if we were to re-retrieve the value just stored.
No parameter modification
This issue affects the cases where DPT C++ API functions have parameters which are passed by reference and modified, for example in this piece of C++ code the function is effectively setting three return values at once:
std::string fname;
FieldValue fvalue;
while (record->getNextFVPair(fname, fvalue))
    cout << fname << " = " << fvalue << endl;

In some cases it would be possible to use the same kind of construction in Java, but not in others. In the above case FieldValue is a custom object which we could modify, but Java Strings do not allow modification of the underlying object after creation, so that could not be done. In any case the common Java convention is for functions not to modify their parameter objects, and instead return all results via the actual return value, and this is what the DPT API does, while attempting to keep things broadly similar to the C++. In the case of the above function, the three separate "return values" are collected into a special Java object, FieldValuePairIterator, which is used something like this:
for (FieldValuePairIterator pair = record.getFirstFVPair();
     pair.exists();
     pair = record.getNextFVPair())
{
    System.out.println(pair.getName() + " = " + pair.getValue().extractString());
}
In other cases the C++ API uses pass-by-reference purely because potentially large structures are the "result" of the function. In these cases the Java API's function signatures are rearranged so that the large object reference actually is the return value of the function. For example:
//C++
RecordCopy snapshot;
record.CopyAllInformation(snapshot);

//Java
RecordCopy snapshot = record.copyAllInformation();
No implied constructor calls
C++ code is often written using function parameter values which don't actually match the declared parameter types, but which the compiler can use to quietly construct a suitable const temporary to pass indirectly to the function. Java does not allow this. Most common with the DPT C++ API are functions which take a const std::string& or a const FieldValue&, either of which can be called by giving a const char* such as "ABC", since both those classes have constructors which take const char*. The implications of this for the DPT API are really a question of whether it wants to provide the ability to call it with the same convenience as the C++ code, or force the user to code things like:
record.addField("LASTNAME", new FieldValue("SMITH"));

As it happens the Java language treats String as a special case, and we can actually use the literal above instead of new String("LASTNAME"), but the issue remains for the DPT custom FieldValue class. The approach taken here is that all functions which in the C++ API take FieldValue have extra overloads in the Java API taking its different "subtypes". This means that the calling code can use the same tidy syntax as the C++. For example here's part of the Record interface:
interface Record
{
    ...
    int addField(String fName, String fVal);
    int addField(String fName, double fVal);
    int addField(String fName, byte[] fVal);   //see earlier point
    int addField(String fName, FieldValue fVal);
    ...
}
Other miscellaneous Java language issues
The above preliminaries out of the way, let's get down to the business of starting a DBMS session. The readme in the download contains a simple "Hello World!" example with instructions on how to get it working in Eclipse. The following notes discuss what's going on there in more detail.
The syntax for starting and stopping sessions is probably the largest difference from the C++. This complication comes from the fact that the Java classes have been designed with an eye towards future support for more than one "route" to the DBMS.
Starting/stopping the DBMS host, and initiating user sessions
In the C++ API this was done directly via the DatabaseServices class constructors. In Java, however, there is an overall API-management class (the DPTJAPI class mentioned earlier) which amongst other things functions as a factory for DatabaseServices.
At the same time the opportunity was taken to tidy up the function parameters for starting user sessions. For example in the C++ API several of the parameters were only relevant for the first "user 0" session, and ignored otherwise. This is clarified in the Java API by providing two separate function groups, namely startHost(...) and logon(...), meaning that there are never irrelevant parameters as well as making it more obvious when a DBMS host might get started.
For example, on the main thread:
import dptjapi.*;

public class MainClass
{
    public static void main(String args[]) throws InterruptedException
    {
        //Object returned is the "user 0" session
        DatabaseServices db = DPTJAPI.startHost();

        //If we want a multi-user system
        new Thread(new UserThread("User1")).start();
        new Thread(new UserThread("User2")).start();

        //Do some work or just wait for other users to log off
        while (db.core().getUserNos("ALL").length > 1)
            Thread.sleep(1000);

        DPTJAPI.logoff(db);
    }
}
Secondary users started by Thread.start() above:
import dptjapi.*;

public class UserThread implements Runnable
{
    String name;
    UserThread(String name) { this.name = name; }

    public void run()
    {
        DatabaseServices db = DPTJAPI.logon(name);
        /* Do some work */
        DPTJAPI.logoff(db);
    }
}
Note that an explicit "logoff" call is required in the Java API, in contrast to the C++ API where we allowed DatabaseServices objects to be destroyed as they went out of scope. The difference is just because we really don't want to be left at the mercy of the JVM garbage collector for something as fundamental as the starting and stopping of sessions. DPT maintains an internal reference to all DatabaseServices objects created and does not delete them until logoff() is called. The logoff(...) call triggers the underlying C++ DatabaseServices destructors, and after that the Java object becomes invalid as far as DPT is concerned (although the reference still exists as far as the JVM is concerned).
Finally remember that as with the underlying C++ API, the Java API requires that (most) operations via a DatabaseServices object happen on the same thread on which the object was created. This includes logoff, so in client applications which run several users within the same GUI (e.g. the sample file wizard), "closedown" buttons etc. should ensure that they indirectly request threads to terminate their sessions and then wait for them to do so, rather than trying to delete the DatabaseServices objects directly.
Supplying custom LineOutput destinations
(If required).
The default DatabaseServices constructors send user messages to files local to the host, and in the case of the user 0 constructor the audit trail also goes to a file, and nowhere else. However, all DPT API layers offer the option to supply custom output destinations for the user output streams, plus a secondary "echo" destination for the audit trail. For example, the main dpthost.exe console display window is an echo destination for the audit trail, and the sample Java "file wizard" application uses the technique to display all users' output as well as the audit trail. Each such output destination must be a DPT LineOutput object, handle, descendant object etc., depending on which API layer is in use, with the (customizable) Java version residing in package dptutil.
A slight operational restriction when supplying an audit trail echo destination with the user 0 constructor is that user 0 should then be the last one to log off (which would not normally be necessary). The reason is that the user 0 thread's JNI environment is used as the C API callback interface for this output; if user 0's session has been cleaned up, the JNI callbacks made when writing audit trail lines will fail.
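As a sketch of the idea only, a custom destination that echoes lines to the Java console might look something like the following. The writeLine(...) override point is an assumption here; check the dptutil source for the real member to override:

    import dptutil.*;

    public class ConsoleEcho extends LineOutput {
        //Assumed override point - the real dptutil class may name this differently
        public void writeLine(String line) {
            System.out.println(line);
        }
    }

An instance of such a class would then be passed to the relevant startHost(...) or logon(...) variant in place of the default file destinations.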
Wrapped parameter objects
In C++ these would all typically be used in the form of tidy and efficient stack objects. In Java they must be explicitly created with new, but more importantly we must consider their destruction point. Leaving this to the JVM garbage collector is not ideal, since the underlying C++ objects are owned by DPT user sessions and must not outlive them. Therefore these objects provide an explicit destroy() call, which it is recommended to call after using the object. The classes' finalize() functions do make such a destroy() call, so things may be OK if you don't do it yourself, but that is in no way guaranteed.
    void find(DatabaseFileContext context) {
        //Recommended
        FindRecordsSpecification spec = new FindRecordsSpecification("ID=1");
        FoundSet allRecords1 = context.findRecords(spec);
        spec.destroy();

        //Neater, but with risk of sequence problems
        FoundSet allRecords2 = context.findRecords(new FindRecordsSpecification("ID=1"));
    }
Java-local objects
These are different from the above because they are implemented fully as Java objects managing their own data, rather than wrappers managing underlying C++ objects. FieldValue is implemented like this partly because it is so heavily used and benefits from avoiding extra JNI trips, and partly because it needs special Java string-handling abilities as covered earlier. In addition to the C++ object's string and numeric variants, the Java version has an internal byte[] variant as well.
StoreRecordTemplate is implemented locally like this partly because it's a very simple data structure (just two arrays), and partly because in Version 3.0 with Fast Load now available, the performance benefit of implementing a special optimized "one-trip" record store function is less essential. DatabaseFileContext.storeRecord(...) via the Java API makes separate DBMS calls to add each field one by one. Note therefore that some statistics like MSTRADD and MSTRDEL will increase if there are errors during STORE, when they would not normally do so.
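As a rough sketch of the shape of a store operation, usage might look something like the code below. The way the template is populated here is an assumption (a hypothetical constructor taking the two arrays); consult the dptjapi reference for the real constructor or builder functions:

    //Hypothetical population of the two arrays - real signatures may differ
    String[] fieldNames = {"SURNAME", "AGE"};
    FieldValue[] fieldValues = {new FieldValue("SMITH"), new FieldValue("42")};
    StoreRecordTemplate template = new StoreRecordTemplate(fieldNames, fieldValues);

    //One Java call, but several DBMS calls underneath (one per field)
    context.storeRecord(template);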
FieldAttributes and MsgCtlOptions are trivial data structures.
Combination operators
As mentioned earlier, Java does not support overloading of operators such as &=. Combining find conditions by combining these objects is therefore another thing which can't look quite the same as it does in C++. In other words, we can't say things like this:
    spec4 = spec1 & spec2 & spec3

The solution is either to use the explicit splice(...) function, which remains from C++, or to use one of a set of special Java functions which can be stacked. The above example becomes the less elegant:
    spec4 = spec1.boolAnd(spec2).boolAnd(spec3);

Having said that, now that the text-query feature is available, equally likely user code might be something like:
    spec4 = new FindRecordsSpecification(
        "(" + specText1 + ") AND (" + specText2 + ") AND (" + specText3 + ")");
Others
Some more specific C++ data types like RoundedDouble are not used at all in the Java layer. The internal API creates these on the fly as required.
Exceptions
DPT exceptions are thrown in all the same places, as Java class dptjapi.DPTException instead of dpt::Exception in C++. The Java class is derived from RuntimeException rather than plain Exception, to give calling code the option of not using "throws" declarations if it prefers that convention.
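For example, a caller that wants to handle DPT errors explicitly might write something like the following sketch (the find call mirrors the earlier example; the field name and the error handling are just illustrative):

    try {
        FoundSet records = context.findRecords(
            new FindRecordsSpecification("NOSUCHFIELD=1"));
    } catch (DPTException e) {
        //No "throws" declaration was needed above, since DPTException
        //is a RuntimeException, but we can still catch it like any other
        System.err.println("DPT error: " + e.getMessage());
    }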
MRO destruction
As with the C++ API, it is technically safe, but not good style or use of memory, to leave MROs like found sets "hanging" and have them cleaned up later when their owning object is cleaned up (e.g. when the file owning a found set is closed). The underlying MROs *DO NOT* get garbage collected, so there is no excuse for falling into the Java mindset of relying on the garbage collector and just letting e.g. FoundSet references go out of scope. (It would not be possible for the DPT API to implement such an auto-destruction scheme anyway, because it would require a strict hierarchy in the order in which things were destroyed (cursors, then sets, then contexts, and so on), and current garbage collectors cannot offer such guarantees.)
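A sketch of the tidy pattern, assuming FoundSet exposes an explicit destroy() in the same style as the wrapped parameter objects (an assumption; check the dptjapi reference for the real cleanup call):

    FindRecordsSpecification spec = new FindRecordsSpecification("ID=1");
    FoundSet records = context.findRecords(spec);
    spec.destroy();
    try {
        /* Work with the set */
    } finally {
        records.destroy();   //don't wait for the file to be closed
    }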
"Sentry" objects
The C++ API provided a number of utility helper classes of the general type "sentry", making use of the C++ RAII convention. Since RAII is not possible in Java, the equivalent functionality is generally provided by "get()" and "release()" style member functions in appropriate places.
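The natural Java idiom for pairing such calls is try/finally, which plays the role of the C++ destructor. A generic sketch, where the object and member names are purely illustrative:

    someSentry.get();           //acquire, like the C++ sentry constructor
    try {
        /* Work while the resource is held */
    } finally {
        someSentry.release();   //always runs, like the C++ destructor
    }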
LineOutput objects
These DPT utility LineOutput objects terminate print lines at hex zero. A generalized underlying mechanism allowing strings to contain hex zero exists, but it is not currently used, for efficiency reasons: where this class is used at all, it is used heavily. In JNI it is far more efficient to handle strings as character arrays if you can accept that limitation, because you then avoid making JVM calls to access Java String object information.
LineInput objects
These return a null String to indicate EOF, where the C++ versions return a false bool.
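A typical read loop therefore tests for null. In this sketch the readLine() name is an assumption about the LineInput interface (check dptutil for the real function), and process(...) stands in for whatever the caller does with each line:

    String line;
    while ((line = input.readLine()) != null) {   //null means EOF
        process(line);
    }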
Volume Updates
As previously mentioned, the Java API record-store mechanism is not implemented with optimum efficiency in mind. In a similar vein, the deferred-update APIs are not provided at all. This is a deliberate omission, to encourage use of the much superior fast-load route for volume updates.
API appearance
By way of comparison, adding a field to a record looks like this in each layer:

    C++ base       Record::AddField("Pi", 3.14)
    C++ wrapped    APIRecord::AddField("e", 2.72)
    C via DLL      RecordAddFieldN("R", 7.23)
    Java           Record.addField("i", Double.NaN)
API Completeness