Prototype Demo and Feedback
The prototype prompts the user to specify the privacy information and recommends a replacement based on rudimentary tables. Although the dependencies were packaged in the jar, the database was not embedded, and the app would fail if a data source was not running. For the demonstration, we had NetBeans running a Derby database in the background that the jar could bind to.
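As a rough illustration of what table-driven replacement could look like, here is a minimal sketch. It is not the prototype's code: the class name `TableReplacer`, the method `sanitize`, and the in-memory `Map` (standing in for the Derby-backed tables) are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: names are illustrative and not taken from the prototype.
final class TableReplacer {

    // Replaces every whole-word occurrence of a table key with its mapped value.
    // The text is scanned once, so a replacement is never itself replaced again.
    static String sanitize(String text, Map<String, String> table) {
        if (table.isEmpty()) {
            return text;
        }
        // Build one alternation of all keys, quoting each so it is matched literally.
        StringBuilder alternation = new StringBuilder();
        for (String key : table.keySet()) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(key));
        }
        Matcher m = Pattern.compile("\\b(?:" + alternation + ")\\b").matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // m.group() is exactly one of the keys, so the lookup cannot miss.
            m.appendReplacement(out, Matcher.quoteReplacement(table.get(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed sample data; the prototype reads its tables from a database.
        Map<String, String> table = new LinkedHashMap<>();
        table.put("Eric", "Alex");
        table.put("Philadelphia", "Springfield");
        // prints: Alex lives in Springfield.
        System.out.println(sanitize("Eric lives in Philadelphia.", table));
    }
}
```

Scanning in a single pass, rather than applying one replacement after another, avoids a replacement value being rewritten by a later table entry.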
Q. Will we support dynamic learning of profiles?
A. The hope is that the tool will be able to learn profiles dynamically. It would go through the document and ask the user whenever it identifies something. For example, on encountering the name "Eric", it might ask whether Eric is part of an existing profile or a new one, and whether the name should be abstracted or changed everywhere. For any further identifying text, it would ask which profile the text reflects. In this manner it would be able to learn multiple characters on the fly.
Q. How is this more than a straight text replacement (the work done for the demo)?
A. We discussed dead ends, the problem of incorporating prohibitively large data, and work on services such as OAuth2.
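The ask-once-then-remember idea described in this answer might be sketched as follows. This is an assumption-laden illustration, not a design decision: `ProfileLearner`, `profileFor`, and the `prompt` callback (a stand-in for whatever dialog the UI would show) are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical sketch: none of these names come from the prototype.
final class ProfileLearner {
    // Maps an identifying token (for example a name) to the profile it was assigned to.
    private final Map<String, String> tokenToProfile = new HashMap<>();
    private final List<String> profiles = new ArrayList<>();
    // Stand-in for the user prompt: given a token and the known profiles, it returns
    // the profile name the user chose (either an existing one or a new one).
    private final BiFunction<String, List<String>, String> prompt;

    ProfileLearner(BiFunction<String, List<String>, String> prompt) {
        this.prompt = prompt;
    }

    // Returns the profile for a token, asking the user only the first time it is seen.
    String profileFor(String token) {
        String known = tokenToProfile.get(token);
        if (known != null) {
            return known;
        }
        String chosen = prompt.apply(token, new ArrayList<>(profiles));
        if (!profiles.contains(chosen)) {
            profiles.add(chosen); // the user created a new profile
        }
        tokenToProfile.put(token, chosen);
        return chosen;
    }

    // Returns a copy of the profiles learned so far.
    List<String> profiles() {
        return new ArrayList<>(profiles);
    }

    public static void main(String[] args) {
        // Simulated user: files "Philadelphia" under Eric, otherwise starts a new profile.
        ProfileLearner learner = new ProfileLearner(
                (token, known) -> token.equals("Philadelphia") ? "Eric" : token);
        learner.profileFor("Eric");
        learner.profileFor("Philadelphia");
        System.out.println(learner.profiles()); // prints: [Eric]
    }
}
```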
Q. Will we offer expanded database incorporation through web services since the databases are too big to ship?
A. That wasn't something we had expanded on. We are looking to maintain a Spring-like architecture that would scale well to an app server model, although we would also need to provide a server jar that converts the SanitizationResult to JSON. Alternatively, in a client-server model, if the customer wanted to install their own database, we should be able to support it (we would need to provide a client jar that can interpret the SanitizationResult). We also discussed dynamic SOAP calls to access web services for data, although this was originally out of scope. Jeff Salvage wanted to talk more about this offline, possibly via Skype.
Changes From Prototype
Currently (if we can get the embedded database working), the prototype's standalone approach ignores the needs of corporate users. Jeff is also worried it might not be challenging enough for a Senior Design project. The options under consideration are:
- Standalone jar that can run natively on Java without a central database or web server (the Swing GUI would be the UI). This could alternatively interface with a larger database such as Freebase that the customer would install on a central server (Freebase's weekly data dumps are in excess of 250 GB).
- Server jar that can run natively on Java with or without a central database. If accessed from outside the jar, it would return the result as JSON.
- Server jar that can run natively on Java with or without a central database. A Client jar would interface with the server from multiple nodes and be responsible for sending and interpreting results as well as providing a native UI option.
- Standalone jar that can run natively on Java without a central database or web server (the Swing GUI would be the UI). SOAP calls would then allow access to external database WebService APIs.
- Web server (JBoss?) that would interface with a central SQL database. This would mean rewriting the UI, but would also allow us to leverage many of the analytic APIs available for a more whiz-bang interface. The framework changes mentioned have been moving toward this anyway. Our deliverable would be an EAR containing the web server with all jars, dependencies, and WAR files appropriately distributed, plus the database build scripts. A production environment would consist of a database server and a web server (which can be the same machine), with web browsers on client machines as the clients. This would also require a [trivial] configuration wizard to make sure the correct database names, ports, etc. get placed in the correct location. This is the most ambitious of the options. JBoss Developer Studio is a modified Eclipse environment built for making JBoss development easier, and you can run and deploy the web server from the IDE.
- Represent data as entities that can be generated from tables. This will eliminate most of our current HQL queries. The goal is to have them JPA-managed.
- To support JPA, we will need to use Java EE, which has transaction management built in. The standalone frameworks that can do this to a limited extent are no longer supported or maintained. I believe the WYSIWYG editor in JBoss Developer Studio is buggy on EE 7, but WildFly 8 (formerly JBoss Application Server) does support EE 7. WildFly now includes Enterprise Bean management for long-running conversations as well as native JSON support.
- OAuth2. When Google detects that the user is signed in to an account other than the OAuth2 login (mine), it asks the user to log in. This is not the desired behavior and should be fixed.
We might want to reconsider what is public/protected. For instance, the Swing GUI could share a package with the framework for handling the SanitizationResult. The only public method would then be the one that returns a JSON result, for customers deploying this jar in their web server.
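One way this visibility split could look is sketched below. The real SanitizationResult's fields are not described in these notes, so the class `SanitizationResultSketch`, its two fields, and the method names are all hypothetical. The JSON is assembled by hand only to keep the example free of dependencies; a real build would more likely use a JSON library or the server's built-in support.

```java
// Hypothetical stand-in for the framework's SanitizationResult; the fields are assumed.
// In a real project the class would be public in its own file; it is left
// package-private here so the sketch compiles as a single file.
final class SanitizationResultSketch {
    private final String sanitizedText;
    private final int replacementCount;

    // Package-private: only framework code in the same package creates results.
    SanitizationResultSketch(String sanitizedText, int replacementCount) {
        this.sanitizedText = sanitizedText;
        this.replacementCount = replacementCount;
    }

    // Package-private accessors: reachable by a GUI that shares the package,
    // hidden from code that merely depends on the jar.
    String sanitizedText() {
        return sanitizedText;
    }

    int replacementCount() {
        return replacementCount;
    }

    // The single public entry point: callers outside the package get JSON only.
    public String toJson() {
        return "{\"sanitizedText\":\"" + escape(sanitizedText)
                + "\",\"replacementCount\":" + replacementCount + "}";
    }

    // Escapes the characters that JSON does not allow unescaped inside a string.
    private static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: {"sanitizedText":"Alex lives in Springfield.","replacementCount":2}
        System.out.println(
                new SanitizationResultSketch("Alex lives in Springfield.", 2).toJson());
    }
}
```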