Running CLD as a web service
****************************

.. warning::
   CLD has not been vetted by a security expert. It is not safe
   to run it on a public server.

There are a couple of differences between the desktop application
and the web application. When running on the desktop, one automatically
has full permissions; the internal password authentication system is
not used. One also has access to the corpus manager, which allows one
to switch between corpora and create new corpora. The web version only
accesses a single fixed corpus.

That being said, the differences have to do with configuration rather
than code. The desktop application actually just runs the web
application within a Python web server that it runs internally,
effectively using a browser as its user interface.

Local testing
-------------

It is recommended to create a directory just for the CLD corpus and
supporting materials, outside of the Apache document directory::

    $ mkdir cld
    $ cd cld

Create an empty corpus::

    $ cld corpus.cld create

One can do local testing first, before deploying. Run in web server
mode rather than desktop mode::

    $ cld corpus.cld -w

This will start an internal Python web server and open a browser window
pointing at localhost:8000. You should get a Login page.

One cannot log in without defining users. For the sake of
illustration, let us create a user named leo. Stop CLD (use Ctrl-C),
and do::

    $ cld corpus.cld auth set leo

This creates an account for leo and prompts you for a password.

Notice that the password and sessions files reside in a directory
called 'auth' that is a sibling to the corpus directory. That location
is a configuration setting, which you may change if you desire. To see
the current configuration settings::

    $ cld corpus.cld config

Now that we have created a user, let us restart the web server
(``cld corpus.cld -w``) and log in using user name 'leo' and the
password you chose.
When you do so, you get a new page, and it should indicate, in the
upper right corner, that 'leo' is logged in. But it says the corpus is
not readable.

When a corpus is created, no permissions are automatically granted.
Stop the web server again, and make leo an owner of the corpus::

    $ cld corpus.cld perm / add leo owners

The slash indicates the root directory; leo is being added to the list
of owners. Permissions are inherited, so leo will be owner of any
additional subdirectories that we create, unless we explicitly remove
leo from the owners list of some subdirectory.

Now restart the web server. Unless you clear your browser history, or
wait long enough for the session to time out, leo will still be logged
in, and you will now get a list of corpus contents.

Creating a CGI script
---------------------

One can create a CGI test script that just displays environment
variables. The contents of the script::

    #!/Users/abney/anaconda3/bin/python
    import site
    # Make the seal package importable
    site.addsitedir('/Users/abney/git/seal/python')
    from seal.app.toplevel import test_app, Manager
    Manager(app=test_app).cgi()

To run CLD, the CGI script should look something like the following.
(The pathnames may need to be different in your environment.)::

    #!/usr/local/bin/python
    import site
    # Make the seal package importable
    site.addsitedir('/usr/local/seal/python')
    from seal.cld.toplevel import CLDManager
    mgr = CLDManager('/usr/local/cld/corpus.cld',
                     auth_dir='/usr/local/cld/auth',
                     log_file='/usr/local/cld/log',
                     logging='all')
    mgr.cgi()

For debugging, examine the log file. Its pathname is given in the CGI
script (the ``log_file`` argument above).

Configuration
-------------

Configuration file
..................

The configuration may be stored in a file, provided on the command
line, or created as a dict in Python. It is passed to the App
constructor.

A complete list of configuration variables is provided in the section
Configuration keys. One may also wish to refer to the list of Logging
conditions.

Password and session files
..........................

To enable password protection, one requires a password file and a
sessions file. These are plain-text files named 'users.txt' and
'sessions.txt' in the server_dir. They should be readable by httpd, but
not world-readable. **They should absolutely not be under htdocs.**

The password file contains one line for each user. The fields are the
user name, the salt, and the password hash value. The sessions file
also contains one line for each user; its fields are user name, token,
expiration, and client address.

The 'auth' script can be used to manage them. Here are examples of the
commands::

    $ auth ls
    $ auth set user
    $ auth check user
    $ auth delete user

All of the commands print out the locations of the password file and
the sessions file.

* The 'ls' command simply lists the user names.

* The 'set' command prompts for a (new) password, and sets the
  password for the user. It also deletes any active session that the
  user may have.

* The 'check' command prompts for a password and indicates whether or
  not it is correct.

* The 'delete' command deletes a user from both the password and
  sessions files.
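
To make the three password-file fields (user name, salt, hash) more
concrete, the following is a minimal sketch of how such a record can be
produced and checked. It is not CLD's actual implementation: the hash
function (PBKDF2-HMAC-SHA256), the iteration count, the hex encoding,
the tab separator, and the names ``make_record`` and ``check_record``
are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

# Assumed parameter: the real hash function and iteration count used by
# CLD are not stated in this document.
ITERATIONS = 100_000


def make_record(user, password, salt=None):
    """Build a 'user<TAB>salt<TAB>hash' line for a hypothetical users.txt."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), salt, ITERATIONS)
    return '\t'.join([user, salt.hex(), digest.hex()])


def check_record(line, password):
    """Recompute the hash from the stored salt and compare the results."""
    user, salt_hex, hash_hex = line.split('\t')
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'),
        bytes.fromhex(salt_hex), ITERATIONS)
    # compare_digest avoids leaking information through timing
    return hmac.compare_digest(digest.hex(), hash_hex)


record = make_record('leo', 'correct horse')
print(check_record(record, 'correct horse'))  # True
print(check_record(record, 'wrong guess'))    # False
```

Storing a per-user salt alongside the hash means that two users with
the same password get different hash values, which is presumably why
the password file carries the salt as a separate field.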