I’ve been told I will have a very large file to import into MySQL, 160 million rows. I don’t have the file yet but I wanted to see how my queries would work with such a database so I generated some content. It’s a database of SHA1 hashes which I will store as strings of length 32 in base 32 (26 letter alphabet and digits 2-7). In base 16 (0-9 and A-F), they would be 40 characters long.
First task is to generate the data. I first wrote some SQL queries to generate random strings (stored procedure, RAND(), array of valid characters). Based on some smaller scale tests, I estimate this would take 80 hours on an InnoDB table. Much too long, so I tried taking the SHA1() of a random number. This was estimated to take 47 hours on an InnoDB table but only 75 minutes on a MyISAM table. Apparently, this is due to InnoDB keeping a log of everything so it can roll back the transaction if something goes wrong. Still too long so another option I tried was to generate the entries with a short Python script using hashlib.sha1(str(i)).hexdigest() where i is an int between 0 and 159,999,999. This only took about 9 minutes and the total file size was 6.5 GB.
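For reference, the generation step might look like this in Python 3. This is a minimal sketch rather than the script I actually ran: the function names and the output path are made up for illustration. It also shows the base 32 form of the same hash, which is 32 characters long against 40 for hex.

```python
import base64
import hashlib


def sha1_hex(i):
    """SHA1 of the decimal string of i, as 40 hex characters."""
    return hashlib.sha1(str(i).encode("ascii")).hexdigest()


def sha1_base32(i):
    """Same hash in base 32 (A-Z and 2-7): 160 bits / 5 bits = 32 characters."""
    digest = hashlib.sha1(str(i).encode("ascii")).digest()
    return base64.b32encode(digest).decode("ascii")


def write_hashes(path, count):
    """Write one hex hash per line for i = 0 .. count - 1 (hypothetical helper)."""
    with open(path, "w") as out:
        for i in range(count):
            out.write(sha1_hex(i) + "\n")


if __name__ == "__main__":
    # Small illustrative run; the real file had 160,000,000 lines.
    write_hashes("hashes.txt", 1000)
```

Each hex line is 41 bytes including the newline, so 160 million lines come to about 6.56 GB, which matches the file size above.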
Second task is to get the data into the database. InnoDB is too slow. MyISAM took about 10 minutes to import the text file (one value per line) using the LOAD DATA LOCAL INFILE statement. This is conveniently handled via the GUI in HeidiSQL. The total database size was 22 GB, but I think I made the char field a little too big (40 characters long instead of 32).
Searching for a single value in this table took close to 4 minutes, so I added an index on the only column. The index took several hours to generate and added 7 GB to the table size. The basic select query now takes less than 0.1 seconds.
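The whole sequence (MyISAM table, bulk load, index added afterwards) boils down to three statements. The sketch below only builds the SQL text so it can be pasted into a client; the table name, column name and file path are hypothetical, and I did the real import through HeidiSQL rather than with this code.

```python
def bulk_load_statements(table, column, path, width=40):
    """Return SQL for a MyISAM table, a bulk load and an index built after the load.

    No escaping is done, so only pass trusted, hard-coded names and paths.
    """
    create = "CREATE TABLE {t} ({c} CHAR({w}) NOT NULL) ENGINE=MyISAM".format(
        t=table, c=column, w=width)
    load = ("LOAD DATA LOCAL INFILE '{p}' INTO TABLE {t} "
            "LINES TERMINATED BY '\\n' ({c})").format(p=path, t=table, c=column)
    index = "ALTER TABLE {t} ADD INDEX ({c})".format(t=table, c=column)
    return [create, load, index]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in bulk_load_statements("hashes", "digest", "/tmp/hashes.txt"):
        print(statement + ";")
```

Setting the engine explicitly in the CREATE TABLE statement avoids ending up with the server's default engine by accident.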
The next day, I wanted to repeat this and see if increasing the value of the myisam_sort_buffer_size variable would speed up the index creation. Unfortunately, I created an InnoDB table (the default) and didn’t change the engine to MyISAM before starting the bulk import. Several hours later, I killed HeidiSQL, then loaded the MySQL command line client, ran show full processlist and kill 2, where 2 was the process id of my import query. This didn’t stop the constant hard drive access because the InnoDB engine then started to roll back the import. Apparently, the rollback can take even longer than the import itself, and the usual advice is that it can’t be stopped, even by restarting mysqld or the whole server, because it resumes automatically when the server comes back up. This isn’t quite true: I came across a page in the documentation which explains that you can set innodb_force_recovery to 3 in the configuration file to prevent the rollback. You then restart the server, go to the database and drop the problematic table, delete the innodb_force_recovery setting or put it back to zero, and restart the server again. There were a few issues with the server not stopping correctly and I had to reboot a few times, but it did work and I was able to access the database again.
Sorry for the lack of references.
I was disappointed to discover that IEEE Pervasive Computing magazine needs my article as a Microsoft Word document because they don’t know how to handle LaTeX files. I don’t understand how a serious scientific society or a professional editor can be unable to handle plain text files, especially since the final typesetting probably takes place in another piece of software.
Fortunately, I found the GrindEQ LaTeX-to-Word converter, a sort of plugin for Word which allows you to open .tex files. It adds some menus to the Word ribbons (toolbars at the top of the page) and makes Word able to open .tex files from the usual File -> Open menu. When you open a LaTeX file, GrindEQ will remind you that you can only run it 10 times before having to register. It may also encourage you to install Ghostscript if you work with EPS files. Remember to set the LaTeX encoding to UTF-8 or whatever you use if you want accented characters to appear correctly.
I was very impressed with the formatting. Almost everything was correct. It’s not the prettiest Word document and it doesn’t seem to use the Word styles (e.g. header 1, header 2, paragraph) but the equations and figures were mostly correct. Here are the errors that I noticed.
- In one equation, the norm symbols (double bars) had been replaced by a ‘P’ character in special font.
- A numbered list which was programmed to start from a value greater than 1 didn’t.
- The title was missing and the document started with a blank page.
- Some figures which were laid out in LaTeX using the \subfloat command were displayed in sequence with their subcaptions. I created a table to hold these figures.
- At a few places in the text, the letters ieeetr appeared because I was using the bibunits package with that particular style of bibliography.
- I used bibunits to create two bibliographies. LaTeX numbers them both starting from 1, but in Word they were all in one continuous sequence. This is not a problem because the reference numbers are consistent between the text and the bibliography.
In my list of references, there is a paper which is a chapter in a book. It seemed obvious that this should be an @INBOOK in BibTeX. However, this entry type won’t let you specify both author and editor. You will see the BibTeX message Warning--can't use both author and editor fields in fischer11slam.
In this particular case, the paper isn’t a chapter in a book by a single author, but it’s a chapter in a collection of articles by different authors. The @INCOLLECTION entry type is the correct one to use.
The @BOOK type will also not allow both author and editor. I find this less understandable but perhaps in the case of a real book, the editor takes on a more understated role.
Large covariance matrices can be difficult to interpret, and this isn’t helped by the fact that they don’t always fit on the screen due to their size or display properly due to a wide range of values.
In Matlab, you can display a covariance matrix P as a heatmap with the command imagesc(P). This will display a grid the same size as your matrix, with the colour of each cell chosen according to the corresponding value in your matrix. With the default colour map the lowest value will be dark blue and the highest value dark red. Different colour maps can be set with the colormap command. Display the colour bar beside your figure to get a reminder of actual values. Some covariance matrices contain a mix of units and some values will be so large that they mask others. In this case you may need to manually normalise P or split it into several blocks with a consistent range of values.
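One way to do the normalisation is to convert P into a correlation matrix, dividing each entry by the two corresponding standard deviations so that every value falls between -1 and 1. That is my own reading of "normalise" here, not the only option, and the sketch is in plain Python rather than Matlab. The example matrix is made up.

```python
import math


def covariance_to_correlation(P):
    """Return R with R[i][j] = P[i][j] / sqrt(P[i][i] * P[j][j]).

    Assumes P is square with strictly positive diagonal entries.
    """
    n = len(P)
    sigma = [math.sqrt(P[i][i]) for i in range(n)]
    return [[P[i][j] / (sigma[i] * sigma[j]) for j in range(n)] for i in range(n)]


# Hypothetical 2x2 covariance mixing a small and a large variance.
P = [[4.0, 30.0], [30.0, 900.0]]
R = covariance_to_correlation(P)
```

After scaling, the diagonal is all ones, so the large variances no longer dominate the colour range of the heatmap.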
I’d been wondering how to do this for ages and found the idea on one of the Mathworks forums. Thanks Warren.
Trying to get some very basic marker detection working for an indoor localisation project. After days of searching the web I still had to call on Jayson to help me out. After so many failures, it was a surprise and a relief that the solution below just worked. It’s a shame most of these AR projects seem to be out of date or undocumented.
Compiling ARToolKitPlus
Get ARToolKitPlus from Launchpad. As I write this in May 2011, the latest version is 2.2.1 released on 2011-02-05. Download the tar.bz2 archive.
tar xvjf ARToolKitPlus-2.2.1.tar.bz2
cd ARToolKitPlus-2.2.1
mkdir build
cd build
cmake ..
make
sudo make install
cmake-gui is a useful alternative to cmake if you need to change the configuration.
Sample code
There is a very simple example in sample/simple. Compile it with g++ main.cpp -o main -lARToolKitPlus. Run it from the sample directory so it can find its sample image.
Apparently the simple tracking locates markers individually, but can still locate multiple markers in each frame, and the multi tracking locates several markers that are assumed to be on the same plane and computes the equation of that plane.
Camera calibration
Calibrating the camera (to compensate for distortion) is important if you want to estimate accurate positions of the markers relative to the camera.
For camera calibration, follow the instructions from the bottom of the Studierstube website. Here are some instructions on how to use the toolbox.
- Download the Matlab calibration toolbox by Jean-Yves Bouguet.
- Take about 10 pictures of a chequerboard with the camera you wish to calibrate. Store them in the toolbox_calib directory along with the Matlab code you just downloaded. File names must end with an incrementing number (before the file format extension). Format can be ras, bmp, tif, pgm, jpg or ppm.
- Run the calib script.
- Choose the standard method.
- Image names or Read images seem to do the same thing. If your files are called 0.bmp, 1.bmp, 2.bmp… then hit return for the basename and enter ‘b’ for the format.
- Extract grid corners.
- Hit return without entering anything to process all images.
- Depending on the pixel size of your screen, the resolution of your camera and how accurately you can click, you may need to increase the wintx and winty parameters. These are half the size in pixels of the box within which the toolbox looks for grid corners. 20 was a good value for my 5×7 chequerboard with 40mm squares and a 640×480 camera.
- Hit return again to use the automatic square counting mechanism or enter your own values. Count the number of squares in each direction on your calibration board and subtract 2 from each (in other words, don’t count the outer squares).
- Click the four corners of the board, starting at the top left, and going clockwise.
- Enter the size of the squares on the board in millimetres. You only get asked this for the first image. If you get it wrong you need to close the toolbox and restart (maybe clear the workspace too).
- If your camera and lens distort the image a lot, the other corners in the image will not be detected properly because they aren’t spaced at predictable intervals. By entering a value between -1 and 1 for the initial distortion you can get the initial corner estimates to be closer to the real corners. I had no idea what a good value was so just tried a few until the red crosses were more or less in the right positions. Note that whether you enter an initial estimate for the distortion factor or not, the window you need to look at may be hidden behind another figure.
- The toolbox will ask you to select the four outer corners and then attempt to detect the others for all images in succession.
It shouldn’t matter too much if some of the corners aren’t detected properly. I expect they are ignored as outliers at the end.
- When you’ve detected the corners on all images, click Calibration.
- Visually check that the computed parameters make sense by clicking Undistort image. Enter 1, then the name (including number) of the image without the extension. The image should very clearly be undistorted, i.e. the lines on the chequerboard should be straight.
- Click Show calib results and use the numbers given to create a camera calibration file to use in ARToolKitPlus. The format is given on the Studierstube page:
[line1]: ARToolKitPlus_CamCal_Rev02
[line2]: xsize ysize cc_x cc_y fc_x fc_y kc1 kc2 kc3 kc4 kc5 kc6 iter
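Writing the file by hand is error-prone, so a few lines of Python can help. This is a rough sketch under two assumptions of mine: that the [line1]: and [line2]: prefixes are labels in the documentation rather than part of the file, and that the values on the second line are separated by spaces. The function name and the numbers in the example are placeholders, not a real calibration.

```python
def camera_calibration_text(xsize, ysize, cc, fc, kc, iterations):
    """Format the two lines of an ARToolKitPlus_CamCal_Rev02 file.

    cc and fc are (x, y) pairs, kc holds the six distortion values.
    """
    if len(cc) != 2 or len(fc) != 2 or len(kc) != 6:
        raise ValueError("expected 2 cc values, 2 fc values and 6 kc values")
    fields = [xsize, ysize] + list(cc) + list(fc) + list(kc) + [iterations]
    header = "ARToolKitPlus_CamCal_Rev02"
    return header + "\n" + " ".join(str(v) for v in fields) + "\n"


# Placeholder numbers for a 640x480 camera, for illustration only.
text = camera_calibration_text(
    640, 480, (320.0, 240.0), (500.0, 500.0), (0.0,) * 6, 0)
```

Copy the real values from Show calib results into the call and save the returned text to your calibration file.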
Using urllib2 to retrieve some RDF data for SPARQLWrapper (semantic web project) but getting ‘HTTP Error 409: Conflict’. This is caused by the university proxy.
The solution is to edit /usr/share/python-support/python-sparqlwrapper/SPARQLWrapper/Wrapper.py. Add these three lines at the end of the __init__ function:
proxy = urllib2.ProxyHandler({"http" : 'http://wwwcache.lancs.ac.uk:8080'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
Found the answer in a StackOverflow post by another Lancaster Uni person.
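On Python 3, urllib2 has been merged into urllib.request, and the same three calls exist there. The sketch below is the equivalent I would expect to work, assuming the library you are using makes its requests through urllib.request.urlopen; I have only wrapped the calls in a function (the name is mine) so they can live in your own script instead of the installed Wrapper.py.

```python
import urllib.request

PROXY_URL = "http://wwwcache.lancs.ac.uk:8080"  # proxy from the snippet above


def install_http_proxy(proxy_url):
    """Send this process's plain-HTTP urllib requests through proxy_url."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url})
    opener = urllib.request.build_opener(proxy)
    # install_opener makes this opener the default for urllib.request.urlopen.
    urllib.request.install_opener(opener)
    return opener


opener = install_http_proxy(PROXY_URL)
```

Nothing is sent over the network until a request is actually made, so this is safe to run at import time.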
The XSens XBus Master (and inertial sensors) can be used with Bluetooth on Linux. The battery life of the XBus is reduced and there’s the possibility that the connection may be lost but it requires one cable less.
First set the XBus Master to Bluetooth mode: turn it off (3 presses of the power button), then hold the power button until the LED turns blue or purple.
Pair your computer with it using the GUI or some hcitool commands. The PIN is 0000 and the device will be called something like XM-B Sv3.5 #1.
Set up an rfcomm port to access the XBus Master as a serial device. Use hcitool scan to find its Bluetooth ID. Then run rfcomm connect 1 00:12:F3:XX:XX:XX to create a serial port at /dev/rfcomm1. Any other number than 1 is probably OK as well, including 0. You should see
eis@eis-eeepc:~$ rfcomm connect 1 00:12:F3:XX:XX:XX
Connected /dev/rfcomm1 to 00:12:F3:XX:XX:XX on channel 1
Press CTRL-C for hangup
Leave this running until you’ve finished your experiment/demo/work.
I use the CMT C++ classes to access the MTx data. In order for these classes to automatically detect the new serial port I made a small change to the cmtscan.cpp. Replace
if (strncmp("ttyS", entry->d_name, 4) == 0 || strncmp("ttyUSB", entry->d_name, 6) == 0)
with
if (strncmp("ttyS", entry->d_name, 4) == 0 || strncmp("ttyUSB", entry->d_name, 6) == 0 || strncmp("rfcomm", entry->d_name, 6) == 0)
and recompile. Alternatively create a symbolic link called /dev/ttyUSB0 (or whatever number is available) pointing to /dev/rfcomm1 (or whatever your Bluetooth serial port is called).
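To check which device names the modified test would pick up on your machine, the same prefix match can be reproduced in a few lines of Python. This is only an illustration of the name filter, not part of the CMT code, and the function names are mine.

```python
import os

# Same three prefixes as the modified condition in cmtscan.cpp.
SERIAL_PREFIXES = ("ttyS", "ttyUSB", "rfcomm")


def is_candidate_port(name):
    """True if a /dev entry name starts with one of the accepted prefixes."""
    return name.startswith(SERIAL_PREFIXES)


def candidate_ports(dev_dir="/dev"):
    """List the full paths of matching entries in dev_dir."""
    return sorted(
        os.path.join(dev_dir, name)
        for name in os.listdir(dev_dir)
        if is_candidate_port(name)
    )


if __name__ == "__main__":
    for port in candidate_ports():
        print(port)
```

With the rfcomm connection from above still running, /dev/rfcomm1 should appear in the output.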
I’ve almost finished an article for IEEE Pervasive Computing magazine. It’s been an interesting process without any major issues but here are a few notes for next time. These apply to this particular magazine article. Some tips are inappropriate for conference or journal papers and maybe even for different magazine editors.
Having spent a long night calibrating Ubisense, here is a checklist to make it faster next time. This is with version 2.0.4 of the software. Many thanks to our resident surveying expert Yukang for staying very late to help me out with this.
Double-check the offsets to apply to surveyed points. Get the signs right. Remember which points were surveyed with the reflector pointing up or down, and take into account the height of the total station.
Take into account the height of the antenna inside the tag. This may add 1-2 centimetres to the height of surveyed points.
Check that the antenna boards inside the sensors haven’t come loose. If they have you need to open the case and clip them back into place.
Check that the sensors are horizontal, roll should be zero.
Optimal pitch of the sensors is 45 degrees downwards. I tend not to point them down enough, so I should pay more attention to this in future.
Ubisense is a real-time location system based on ultra-wide band. Their software has 2 parts, the server which can run on Windows or Linux, and the configuration UI which can only run on Windows. The server is provided as RPMs for installation on Suse or RedHat, but it’s possible to install it on Ubuntu. These instructions are for version 2.0.4 of the Ubisense server software.