Architecture
We get a request coming in over HTTP. What do we do with it?
Web server
Apache (or whatever you have running PHP) gets an HTTP request, and a mod_rewrite rule applies the following logic:
- If the request is not for an existing file
- and the request is not for an existing directory
- then pass the request to index.php.
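The rewrite decision above can be sketched in a few lines. This is a Python illustration of the logic only, not the actual Apache configuration; the function name `resolve_target` and the `docroot` argument are hypothetical.

```python
import os


def resolve_target(docroot: str, request_path: str) -> str:
    """Return the path that would be served for a request.

    Existing files and directories are served as they are; anything
    else falls through to index.php, as the rewrite rule describes.
    """
    candidate = os.path.join(docroot, request_path.lstrip("/"))
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        return candidate
    # Not a real file or directory: hand the request to the front controller.
    return os.path.join(docroot, "index.php")
```

With this shape, a request for a stylesheet that exists on disk is served directly, while an API path such as /v1/submit/isbn ends up at index.php.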
Memcached
Experiments with memcache look promising (see Using Memcached).
Currently only caching responses from colourProfile, as this is the hog at the moment. Tried a 64MB cache, but this wasn't big enough (scanning a 720 pixel image!!). Ramped this up to a 1GB cache, and the page returns.
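A minimal sketch of the caching pattern described here, written in Python for illustration. It assumes results are cached per RGB value; `DictCache` is a hypothetical in-memory stand-in for a memcached client, and `cached_profile`, the key format and the `compute` callback are likewise made up for the example.

```python
class DictCache:
    """Hypothetical stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def cached_profile(cache, rgb, compute):
    """Return the profile for an (r, g, b) tuple, computing it at most once."""
    key = "cp:%d,%d,%d" % rgb  # assumed key format, not the project's
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = compute(rgb)  # the expensive colourProfile call
    cache.set(key, value)
    return value
```

The expensive call only runs on a cache miss, which is why a cache too small to hold the working set (the 64MB case above) gives little benefit.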
cp_controller
index.php instantiates a controller object which, as part of its class constructor, calls all the necessary functions.
- we parse the URL for path and parameters.
- we get the parameters into a handy array. (at each stage checking user input and discarding anything we don't like the look of.)
- we then do something based on the path.
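The "discard anything we don't like the look of" step could look something like the following Python sketch. The whitelist, the patterns and the name `clean_params` are assumptions for illustration; the real controller may check input quite differently.

```python
import re

# Hypothetical whitelist: parameter name -> pattern the whole value must match.
ALLOWED = {
    "isbn": re.compile(r"\d{9}[\dXx]|\d{13}"),
    "output": re.compile(r"[a-z]+"),
}


def clean_params(raw):
    """Keep only known parameters whose values match their pattern."""
    clean = {}
    for name, value in raw.items():
        pattern = ALLOWED.get(name)
        if pattern is not None and pattern.fullmatch(value):
            clean[name] = value
    return clean
```

Unknown parameter names and malformed values are dropped silently rather than rejected, which matches the "discard" wording above.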
In the Colourphon API, we expect certain things to be in certain places in the path. In a way we are using a construct like this:
Do what, with what? And how do you want the output?
Let's take our first example: http://api.colourphon.co.uk/v1/submit/isbn?isbn=0123456789&output=html and break it down a little. What we are aiming to do is provide a way for people (servers) to request that a particular URL is scanned. Note that in this case submit is a little similar to search, in that if we have already dealt with the ISBN, we will return some useful detail about it from our store.
The version is always the first subdirectory from the base URL.
Followed by some sort of action (the "do")
Followed by a "What"
At this stage we know we are going to submit an ISBN in order to find an image to scan (this is Colourphon we are talking about). We then have to do the "with what" part.
And finally the "how do you want your output?"
Current valid options are detailed elsewhere (link please...), but include a queue control for parcelling out work to engines. (WIP)
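The breakdown above (version, action, "what", "with what", output) can be sketched as a small parser. This is a Python illustration using only the standard library; `ApiRequest`, `parse_request` and the default output of "html" are assumptions, not part of the Colourphon code.

```python
from typing import Dict, NamedTuple
from urllib.parse import parse_qs, urlsplit


class ApiRequest(NamedTuple):
    version: str            # first path segment, e.g. "v1"
    action: str             # the "do", e.g. "submit"
    what: str               # the "what", e.g. "isbn"
    params: Dict[str, str]  # the "with what"
    output: str             # how the caller wants the output


def parse_request(url, default_output="html"):
    """Split an API URL into version / action / what / params / output."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 3:
        raise ValueError("expected /<version>/<action>/<what>")
    version, action, what = segments[:3]
    # Keep the first value of each query parameter.
    query = {k: v[0] for k, v in parse_qs(parts.query).items()}
    output = query.pop("output", default_output)
    return ApiRequest(version, action, what, query, output)
```

For the example URL this yields version "v1", action "submit", what "isbn", the parameter isbn=0123456789 and output "html".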
cp
This is the base class providing some basic functionality shared by all other cp classes (except cp_scan_image and colourProfile, which are standalone).
It also holds functions for particular workflows, which may be called by both an HTML page class and some other output class, e.g.:
- to scan an image
- to get bib details
- to get isbnThing results.
Things left in here may get moved later.
cp_page
Handles all parts related to building pages for display in a browser.
- Uses Smarty for templates. Currently page logic is embedded in the class, but there is no reason why this cannot be separated later.
Store Class
Handles gets/sets to the platform store. (Well, that was the idea, but I'm just dropping in Moriarty where required.)
cp_scan_image
Handles the processing of an image, and returns an object of data/stats for subsequent storage.
- What we look for:
- top n average colours - either in the whole image or in parts of the image.
- What you get:
- a bunch of details about how frequent each colour is in the image.
Currently working on a two-mode cp_scan_image: normal and superfast. The idea is to be able to provide a faster but less accurate service on hardware where resource is limited - or bandwidth...
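To make the "top n colours plus frequency" idea concrete, here is a simplified Python sketch. It counts exact pixel values rather than averaging, and it models the superfast mode as sampling every `step`-th pixel; both choices, and the name `top_colours`, are assumptions rather than how cp_scan_image actually works.

```python
from collections import Counter


def top_colours(pixels, n=5, step=1):
    """Return the n most frequent colours as (rgb, count, share) tuples.

    pixels is a list of (r, g, b) tuples. step=1 looks at every pixel;
    a larger step samples fewer pixels, trading accuracy for speed.
    """
    sampled = pixels[::step]
    counts = Counter(sampled)
    total = len(sampled)
    return [(rgb, count, count / total) for rgb, count in counts.most_common(n)]
```

Calling it with step=2 halves the work on the same image, at the cost of possibly missing colours that only appear in the skipped pixels.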
cp_queue
- The queue would be queried from an engine.
- A request for a job will return an ISBN or URL to work with, along with a job_id to quote.
- A request will require authentication.
- A job_id is logged as started.
- A job_id is logged as complete.
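The queue behaviour listed above might be sketched as follows. This is an in-memory Python illustration only; the class `JobQueue`, its method names and the token-based authentication are hypothetical, since the real cp_queue is still being designed.

```python
import itertools


class JobQueue:
    """In-memory sketch: hand out targets and track job state."""

    def __init__(self, targets, valid_tokens):
        self._pending = list(targets)     # ISBNs or URLs waiting for an engine
        self._tokens = set(valid_tokens)  # assumed authentication scheme
        self._ids = itertools.count(1)
        self.status = {}                  # job_id -> "started" or "complete"

    def _check(self, token):
        if token not in self._tokens:
            raise PermissionError("authentication required")

    def request_job(self, token):
        """Return (job_id, target), or None when the queue is empty."""
        self._check(token)
        if not self._pending:
            return None
        job_id = next(self._ids)
        self.status[job_id] = "started"
        return job_id, self._pending.pop(0)

    def complete_job(self, token, job_id):
        """Mark a previously started job as complete."""
        self._check(token)
        if self.status.get(job_id) != "started":
            raise KeyError(job_id)
        self.status[job_id] = "complete"
```

An engine would call request_job, do the scan, then quote the job_id back through complete_job.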
cp_colours
For providing info about colours. Note: currently using the RGB value as a key in the store, so that eventually we could describe every colour individually. (Yes, over sixteen million!! hmmm...)
cp_warm_memcache
provides functionality to:
- calculate colour values from 0,0,0 to 255,255,255 (all 16.7 million odd!) and add to the memcache
- do the above in reverse
- load a csv file of already calculated data into memcache
The above are run from the command line, using wee little scripts to kick them off.
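The three warming modes can be sketched in Python as below. A plain dict stands in for memcache, and the names `all_colours`, `colour_key`, `warm` and `load_csv`, the "r,g,b" key format and the four-column CSV layout are all assumptions made for the example.

```python
import csv


def all_colours(reverse=False):
    """Yield every (r, g, b) from 0,0,0 to 255,255,255, or the reverse."""
    values = range(255, -1, -1) if reverse else range(256)
    for r in values:
        for g in values:
            for b in values:
                yield (r, g, b)


def colour_key(rgb):
    """Assumed cache key format: "r,g,b"."""
    return "%d,%d,%d" % rgb


def warm(cache, compute, colours):
    """Calculate a value for each colour and store it in the cache."""
    for rgb in colours:
        cache[colour_key(rgb)] = compute(rgb)


def load_csv(cache, lines):
    """Load precalculated rows of the assumed form r,g,b,value."""
    for r, g, b, value in csv.reader(lines):
        cache[colour_key((int(r), int(g), int(b)))] = value
```

The full range is 256 × 256 × 256 = 16,777,216 colours, so running forward in one process and in reverse in another lets two workers meet in the middle.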
cp_rdf
A class to build some RDF output for various things. Currently used to update the store with a bunch of default colours that we already know about.
Moriarty will be used for most things, though.
cp_ajax
A class to respond to Ajax requests to get/set data.
cp_http_response
Allows us to quickly build a response with status, content type and body.
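As a rough illustration of building a response from status, content type and body, here is a Python sketch. The function `build_response` is hypothetical and produces raw HTTP/1.1 bytes; the real cp_http_response class presumably sets headers through PHP instead.

```python
from http import HTTPStatus


def build_response(status, content_type, body):
    """Assemble a raw HTTP/1.1 response from its three parts."""
    payload = body.encode("utf-8")
    reason = HTTPStatus(status).phrase  # e.g. 404 -> "Not Found"
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload
```

Keeping this in one place means every output class (HTML, RDF, Ajax) can return a consistent status line and content type.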
colourProfile
Rich's class to handle matching of colours.
Engine
- Am I going to do colour profile work?
- Will I have access to a memcache loaded with 16 million RGB colourisation results?
- Will I manage my own queue based on records I find, or will I just ask a master queue?
Harvester
- What am I harvesting?

