It is like the Houdini copy node

The goal of this project was to create objects at the vertices of another object, which is exactly what the copy node in Houdini does (a small fraction of it, of course). As usual with coding, there are many different approaches to this problem. A quick and dirty approach is to re-evaluate the object creation code each frame, deleting and re-creating the copied objects. A cleaner solution is to copy the objects and assign expressions to them, making them follow the animated mesh. Of course, if the mesh is high resolution, there will be tons of objects copied onto it, eventually crashing Maya. So the most elegant way to pull this off is to instance the objects instead of copying them, and attach expressions to each instance.

Anyhow, I took the second route, and attached expressions to the copied objects.

First I tried to access each point by using something similar to [pointnum].px. It didn't work as I expected initially. The px, py, pz values stored per point on the mesh are relative to each point's own origin, so since there isn't a point to world space conversion taking place, everything pops to the world origin unless there is some sort of offset that tells the points otherwise. Also, the px, py, pz values aren't affected by skin clusters or any other sort of deformation. The only thing affecting them is assigning keyframes to each point separately.

Then, I decided to use the pointPosition command. Using a clever loop, I was able to assign each object that is copied an expression that would transform its position based on a point on the mesh.

I am not putting excerpts from the script here since it would make no sense at all when it is in pieces. So download the whole script instead.


Matrix assignment was a way to play around with MEL loops and create some interesting geometry with them. I used simple for loops and nested them in each other. Creating the geometry was the easy part. The hard part was grouping the cubes and adding expressions to the groups to animate them. 1500 cubes were created, and each 10 consecutive cubes were placed in a group. Two nested for loops did the job. The top for loop also did the following:

  • move pivot to scene origin
  • random speed multiplier generation
  • adding animation expression to the group

I tried to keep a simple look, so the whole scene is in grayscale. Creating and assigning shaders to the groups proved to be not so easy either. The center groups were assigned a lambert with an incandescence of 20, and the exterior environment sphere one with an incandescence of .2. Since I used final gathering, these overly lit objects became very soft light sources, almost like huge area lights with ray traced shadows, but rendering much, much faster. Another thing I had to add to the script was turning off the default scene light.

Here is the code that added the animation expressions

int $groupNum = 1;
string $cubeNames[];
for($g = 1; $g <= 1500; $g = $g + 10)   {
    // Collect the names of the 10 consecutive cubes that form this group.
    for($i = 0; $i < 10; $i++)   {
        $cubeNum = $g + $i;
        $cubeNames[$i] = "pCube" + $cubeNum;
    }
    $groupName = "group" + $groupNum;
    group -n $groupName $cubeNames;
    // Move both pivots to the scene origin so the group rotates around it.
    move -absolute 0 0 0 ($groupName + ".scalePivot") ($groupName + ".rotatePivot");
    $randomSpeed = rand(5,10);
    expression -s ($groupName + ".rotateY = time * " + $randomSpeed) -o $groupName -ae 1 -uc all;
    $groupNum++;
}

And here is the code that created the center group shader and assigned it to the object

shadingNode -asShader lambert -name centerSh;
sets -renderable true -noSurfaceShader true -empty -name centerShSG;
connectAttr -f centerSh.outColor centerShSG.surfaceShader;
setAttr "centerSh.color" -type double3 1 1 1 ;
setAttr "centerSh.ambientColor" -type double3 1 1 1 ;
setAttr "centerSh.incandescence" -type double3 20 20 20 ;
select -r center ;
sets -e -forceElement centerShSG;

download matrix_1.mel


I've changed the whole code to use global procedures instead. Currently, using global procedures for this specific example is completely useless since each procedure is called only once, except for the shader procedure (it's called twice).

So now, the animation expressions are added by this procedure

global proc groupCrt(int $totalNumCubes, int $numCubesInGroup, int $minSpeed, int $maxSpeed) {
    int $groupNum = 1;
    string $groupName = "";
    //string $cubeNames[$numCubesInGroup];
    string $cubeNames[10];
    float $randomSpeed = 0;
    for($g=1; $g<=$totalNumCubes; $g=$g+$numCubesInGroup) {
        // Collect the names of the cubes that form this group.
        for($i=0; $i<$numCubesInGroup; $i++) {
            $cubeNum = $g + $i;
            $cubeNames[$i] = "pCube" + $cubeNum;
        }
        $groupName = "group" + $groupNum;
        group -n $groupName $cubeNames;
        // Move both pivots to the scene origin so the group rotates around it.
        move -absolute 0 0 0 ($groupName + ".scalePivot") ($groupName + ".rotatePivot");
        $randomSpeed = rand($minSpeed,$maxSpeed);
        expression -s ($groupName + ".rotateY = time * " + $randomSpeed) -o $groupName -ae 1 -uc all;
        $groupNum++;
    }
}

And the shaders are created and assigned by this procedure

global proc shaderCrt(string $objName, string $shaderName, vector $colorVector, vector $ambientVector, vector $incanVector) {
    string $shaderNameSG = $shaderName + "SG";
    // Create the lambert and its shading group, then connect them.
    shadingNode -asShader lambert -name $shaderName;
    sets -renderable true -noSurfaceShader true -empty -name $shaderNameSG;
    connectAttr -f ($shaderName+".outColor") ($shaderNameSG+".surfaceShader");
    setAttr ($shaderName+".color") -type double3 ($colorVector.x) ($colorVector.y) ($colorVector.z);
    setAttr ($shaderName+".ambientColor") -type double3 ($ambientVector.x) ($ambientVector.y) ($ambientVector.z);
    setAttr ($shaderName+".incandescence") -type double3 ($incanVector.x) ($incanVector.y) ($incanVector.z);
    // Assign the shading group to the object.
    select -r $objName;
    sets -e -forceElement $shaderNameSG;
}

As a result, the main code is reduced down to a couple of lines

download matrix_2.mel

There are a lot of advantages to using procedures instead of hard-coding everything in the main code.

One of them is that procedures can store lengthy code that will be repeated within the code. That part of the code is only written once, and then called with a single command anytime later. This saves valuable time for the coders as well as making the code more readable.

Another one is that most scripting languages interpret a script from top to bottom, reading everything in the main code block as they go. Storing a repetitive task in a procedure means its source is written and parsed only once, however many times it is called. Also, if a specific procedure is not needed in a specific situation, it is simply never called, so its body is never executed.

3D Scanner

This post is very old and full of information that is all over the place. Please [ go here ] for a summary of the progress.


I have been researching 3D scanners forever now. The amount of data I collected, the number of designs I have trashed, the time I spent looking for compatible parts, and the time I spent choosing a programming language responsive, reliable, and flexible enough to control the hardware are way too much to count.

I finally decided on a design and started working on a prototype of a small part of the system a couple of months ago. It is a very cheap proof of concept rather than a prototype, but it is still good enough to help me predict the problems that I may face later in the building and programming phase. In fact, I ran into my first problem after I finished building the hardware, while writing the code for streaming the point coordinates into Houdini.

I need to work out the algorithms required to recreate a point cloud from distance information and stream the coordinates to Houdini. The resulting point cloud will be blurry and smudgy due to the parts used in the hardware: the infrared beam of the distance sensor is quite wide according to its specifications, and the outdated controller chips are not responsive enough. Everything aside, this will give me a starting point. In short, my goal for this project is to figure out a way to convert the distance information into 3D Cartesian coordinates and plot them in Houdini.


Here is a picture of the hardware I put together in an hour with epoxy, hot glue, and very cheap and inaccurate hardware. Just barely enough for a proof of concept.

Two servo motors are put together in a pan and tilt setup. At the end of the upper servo there is an IR distance sensor that can measure distances between 10 and 80cm. The servos are connected to a USB servo controller, and the IR distance sensor is connected to a USB interface board.

The servos are Hitec, the USB interface and servo control boards are Phidgets, and the distance sensor controller is also Phidgets, but the sensor itself is Sharp.


Random test data generation with Python was easier than I expected. So instead of trying to pipe random data into Houdini, I decided to go ahead and triangulate an actual scan. I realized that the simple trigonometry I expected to work had a fatal flaw. That was after I had put hours of work into solving my problem with simple triangles.

The solution was using sphere and arc calculations instead of just triangles. The formula to recreate a point cloud of the scan is not complete yet, but it is nearly there. Once the formula is complete, I will work on piping the point cloud from the telnet port directly into Houdini instead of saving *.geo files and loading them into Houdini later. This piece of code may also be used in a later iteration of the scanner algorithm for developing a "real-time" scanning mode where the point cloud is generated on the fly while the object is still being scanned.

Since I have been working on the IR sensor since the start, all of the code is already optimized for the IR version. So I will have more time to prepare a demo.


I have tinkered around with the hardware design a bit to make the rotations happen around the nodal point. Parallax and warping are still there, but the results are much better.

Due to the IR noise in the environment I had to change the algorithm to take more measurements per angle and average them. (Reading thru this post much later, I realize that using a Gaussian least squares method on all the measurements per point might have been much better.) As a result, performance took a huge hit, but the results are looking much better. I am looking at roughly 30 minutes per scan now.
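
The averaging step can be sketched roughly as below (Python 3). This is only an illustration: `read_distance` is a hypothetical stand-in for whatever call returns a single sensor reading, not a Phidgets function, and the sample count is an assumption.

```python
import statistics

def averaged_distance(read_distance, samples=10):
    """Take several readings at the current servo angle and return their mean.

    read_distance is a hypothetical callable that returns one raw distance
    reading; more samples reduce noise but slow the scan down proportionally.
    """
    readings = [read_distance() for _ in range(samples)]
    return statistics.mean(readings)

# Usage with canned readings standing in for the real sensor.
fake_readings = iter([10.0, 12.0, 11.0, 11.0])
print(averaged_distance(lambda: next(fake_readings), samples=4))
```

Since every point costs `samples` readings, the scan time grows linearly with the sample count, which matches the performance hit described above.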

As I stated before in an update, simple triangulation, trigonometry, and arc calculations are not working in this case. Arc calculations got close, but the elegant solution is coordinate space conversion.
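
A minimal sketch of such a conversion, in Python 3, treating each reading as a spherical coordinate (distance, pan angle, tilt angle). The axis convention here is an assumption: y is up, pan rotates about y, tilt is the elevation above the horizontal plane, and both rotations happen around the same point. It is not necessarily the exact formula used in the scanner script.

```python
import math

def pan_tilt_to_xyz(distance, pan_deg, tilt_deg):
    """Convert one reading (distance, pan, tilt) to Cartesian (x, y, z).

    Assumed convention: y is up, pan = tilt = 0 looks down +z,
    positive pan turns toward +x, positive tilt looks upward.
    """
    pan = math.radians(pan_deg)
    tilt = math.radians(tilt_deg)
    horizontal = distance * math.cos(tilt)  # length projected onto the ground plane
    x = horizontal * math.sin(pan)
    y = distance * math.sin(tilt)
    z = horizontal * math.cos(pan)
    return x, y, z

# A reading of 50cm straight ahead lands on the z axis.
print(pan_tilt_to_xyz(50.0, 0.0, 0.0))
```

The distance of the resulting point from the origin always equals the measured distance, which is a quick sanity check for any variation of this formula.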

I am scrapping the telnet plans for now since I found a way to run the script in Houdini. To reiterate, I could already run the script in Houdini, but it could only rotate the servos and take measurements. Now I have found a way to actually use that info to create the point cloud in real-time.

Houdini Python is becoming clearer to me now. The alienating Houdini object model, and hou module are easier to control. I will make the code add an "add" node, and add points to it as they are scanned for real time Houdini scanning.

Below is the mannequin I tried scanning first. Next to it is the resulting point cloud that I used to instance spheres on. Following is the same with a metal P-51 Mustang model.

And finally I decided to scan my own head and torso... horrible stuff, below left.

Notice how the IR beam gets diffused in hair, and how the body temperature is blurring the sensor's vision. My left eyeball is clearly visible in the image since it reflected the IR beam back, but the rest is just diffused and blurred except for the shirt.

The IR system is cheap but has some fundamental flaws such as

  • Wide beam diameter causes blurred results,
  • IR filter on the CCD doesn't like warm/hot surfaces,
  • IR beams are easily reflected from metals and other semi reflective surfaces to a completely different direction resulting in empty spots in the point cloud.

Below right is an empty scan point cloud to compare the background from all these renders.


I have spent summer experimenting with different techniques, and decided that I would use a technique called structured light scanning. Its price, ease of setup, and most importantly its performance were key factors in my decision.

My infrared scanner was limited by the refresh rate of the controller board and distance sensor I was using. An ideal upgrade would be designing a brand new circuit around an IC called TDC-GPX, which would involve a lot of low level programming. The result would be a LIDAR, which would be very hard to sync with other scanner rigs in the same room due to its insane refresh rates.

One of the other techniques I experimented with was building my own distance sensor using a 1080p camera and a laser pointer. It was limited to 24 frames per second, was very prone to image noise, and had to be calibrated very precisely due to the 1080p resolution.

The idea was that if you mount a laser pointer on a camera, parallel with the camera lens, the farther the laser dot gets from the center of the recorded image, the closer the object is. Needless to say, this idea was scrapped due to the high cost of building each sensor and the sensor performance. To the left is a layered image from the results of my experiment.

This works essentially the same way the IR distance sensor worked, only in 2 dimensions and with much higher resolution.
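
The relationship between the dot's position and the distance can be sketched with an ideal pinhole camera model (Python 3). This is an idealized illustration, not the code from the experiment: the focal length in pixels and the laser offset are assumed to come from calibration, and lens distortion is ignored.

```python
def laser_distance(pixel_offset, focal_length_px, laser_offset):
    """Estimate the distance to the surface hit by the laser dot.

    pixel_offset    -- how far the dot is from the image center, in pixels
    focal_length_px -- camera focal length expressed in pixels (from calibration)
    laser_offset    -- distance between the laser and the lens axis; the
                       result comes out in the same unit

    With the laser parallel to the lens axis, similar triangles give
    pixel_offset / focal_length_px = laser_offset / distance.
    """
    if pixel_offset <= 0:
        raise ValueError("dot at or past the image center: distance is out of range")
    return focal_length_px * laser_offset / pixel_offset

# The dot moves away from the center as the object gets closer.
print(laser_distance(100, 1000, 5.0))
print(laser_distance(200, 1000, 5.0))
```

Because the distance is inversely proportional to the pixel offset, a one pixel error matters far more for distant objects, which is one reason the calibration had to be so precise.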

Other experiments I did were all with using off the shelf software. One of them that I will share some info about is called focus stacking, and the reason I want to share it is because it is an ingenious way of generating a 3D mesh. Something so simple yet I personally wouldn't think of it at all.

Documentation on the software developer's web site does not specify how it works, but from my understanding it relies on edge detection. Here is a pipeline that I believe should describe the process more or less

  1. Take focus stack photos from a locked camera. The only difference between the photos should be tiny increments of focus. The camera stores the focus distance along with other parameters in the EXIF portion of the image.
  2. The software analyzes the stacked images for sharpness by edge detection and assigns sharp pixels a distance value based on the EXIF focus distance. Blurry pixels are skipped.
  3. The result is essentially a z-depth map without a projection matrix.
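
The pipeline above can be sketched in a few lines of Python 3. This follows the guessed process described in the list, not the actual algorithm of any commercial package; the Laplacian sharpness measure, the threshold, and the plain nested lists used as grayscale images are all assumptions made to keep the example self-contained.

```python
def sharpness(image, x, y):
    """Absolute 4-neighbour Laplacian: large where the pixel sits on a sharp edge."""
    return abs(4 * image[y][x]
               - image[y - 1][x] - image[y + 1][x]
               - image[y][x - 1] - image[y][x + 1])

def depth_from_focus(stack, focus_distances, threshold):
    """Build a depth map from a focus stack.

    stack           -- list of grayscale images (lists of rows), one per focus step
    focus_distances -- focus distance of each image, e.g. read from EXIF
    threshold       -- minimum sharpness for a pixel to count as in focus
    Pixels that are never sharp (and the 1 pixel border) stay None.
    """
    height, width = len(stack[0]), len(stack[0][0])
    depth = [[None] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            scores = [sharpness(image, x, y) for image in stack]
            best = max(range(len(stack)), key=lambda i: scores[i])
            if scores[best] >= threshold:
                depth[y][x] = focus_distances[best]
    return depth

# Two tiny 3x3 "photos": the center pixel is sharp only in the first one.
near = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]
far = [[2, 2, 2], [2, 3, 2], [2, 2, 2]]
print(depth_from_focus([near, far], [0.5, 1.0], threshold=10))
```

A perfectly flat surface produces a sharpness of zero in every image, so it never passes the threshold, which is the weakness with smooth surfaces.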

Easy, yes. Simple, yes. Cool, incredibly. Accurate, not the slightest bit... This will not pick up anything smooth even if it is in focus, like walls, table tops, LCD displays, etc. Also, its precision depends directly on the resolution of the camera. For further info google helicon, or focus stacking.

I decided on structured light after I saw a demo video by Kyle Mcdonald.

The hardware required would be relatively cheap, and the code didn't seem too complicated either. The OpenCV library, initially released by Intel and compatible with practically any imaging hardware, isn't too hard to code for either. Everything seemed so simple at the beginning, yet setting up the environment proved not to be.

I have been away from proper programming for far too long. Python and MEL and similar stuff do not count. Matching a proper OS with a decent IDE with access to OpenCV and its C++ bindings was not an easy decision. I jumped back and forth between Ubuntu and OSX for quite a long time. Xcode is pretty and integrated, but getting 3rd party frameworks and libraries to compile took a while to figure out. Needless to say, I decided to stay in Ubuntu for the time being, but the environment in OSX is ready and waiting. The current environment I am using is Ubuntu, NetBeans, C++ and its OpenCV API.

Next thing on the list was matching hardware with the computing environment. I purchased two PS3 Eye cameras online. This is an awesome camera: it does 120fps at 320x240 and 60fps at 640x480, among other modes. It costs about 35 USD. It is easy to modify the M12 lens mount and attach other lenses too. You can even replace the M12 mount with a CS mount. Sony does not include USB drivers for the PS3 camera, which means its drivers are supported by the open source/hacker/DIY community. There is a very good document on how to set it up [ here ].


I have been sharpening my C++, trying to get used to namespaces and the syntax. I am only using includes that behave the same in all operating systems, as always, trying to keep the code platform independent.

I have the part of the code that projects the patterns and records them. It does this 60 times every second, giving me a practical frame rate of 15 frames per second with 3 patterns and 1 white frame. Of course this can easily be increased to 120 times a second, but it would require a projector with a 120Hz refresh rate as well as a lower resolution for the captured images.

There is a sync issue between the camera and the projector, as the top half of the previous pattern can be seen in the captured images. I believe that the issue is caused by the refresh type of the DLP chip in the projector, since the camera uses a global shutter CCD system. I may lose even more frames per second to compensate for it.

I am ignoring this sync issue for the time being and focusing more on the actual pattern comparison algorithm that will create the depth maps / point clouds. Found some papers on the subject by Olaf Hall-Holt and Szymon Rusinkiewicz; Hussein Abdul-Rahman, Munther Gdeisat, David Burton and Michael Lalor; Kosuke Sato; Peisen S. Huang and Song Zhang; and a ton more. I am also looking thru Kyle Mcdonald's structured light code to get familiar with similar algorithms.


I did some more investigation and realized that this technique is stereo pair vision as well. The image sent to the projector is one of the stereo pair, and the image picked up by the camera is the second one. If I could place a camera exactly where the projector is, what the camera would pick up would be exactly the image projector would be showing. So technically, it is possible to treat a projector like a virtual camera as long as the lens parameters and the image is known.

I also realized that I don’t need another C++ framework to export the point cloud or the depth maps since OpenCV already has cv::reprojectImageTo3D function built in. It converts the disparity image into a point cloud map where the pixel RGB colors represent XYZ coordinates.
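
To illustrate what such a reprojection does, here is a simplified Python 3 sketch for an ideal rectified stereo pair. It is not OpenCV's implementation: cv::reprojectImageTo3D works from a 4x4 reprojection matrix produced during stereo rectification, whereas this sketch assumes a plain pinhole model where the focal length, baseline, and principal point (cx, cy) are already known.

```python
def disparity_to_point(u, v, disparity, focal_px, baseline, cx, cy):
    """Convert one pixel of a disparity map to an (x, y, z) point.

    u, v      -- pixel coordinates
    disparity -- horizontal shift of the pixel between the two views, in pixels
    focal_px  -- focal length in pixels (assumed identical for both views)
    baseline  -- distance between the two optical centers; sets the output unit
    cx, cy    -- principal point of the image, in pixels
    Returns None where the disparity is zero or negative (no valid match).
    """
    if disparity <= 0:
        return None
    z = focal_px * baseline / disparity  # depth is inversely proportional to disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)

# A pixel at the image center with 10 pixels of disparity.
print(disparity_to_point(320, 240, 10, 500, 0.25, 320, 240))
```

Storing the three returned values in the three channels of an image gives exactly the kind of "RGB as XYZ" point cloud map described above.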

After reading a lot of stereo photographers' blogs I believe I will have to adjust the distance between the cameras depending on the distance of the object being scanned. From my research I think I need to add about 4 inches of distance between the cameras for every 10 feet the object moves away to keep stereo depth as precise as possible. My theory is that since the cameras are moving now, they will have to cup to keep the stereo vision in focus as well.

For these reasons I picked up two linear actuators that will extend 4 inches, 2 tiny laser pointers, a small sonar with a 20 foot range, and two servos. I will be using my USB I/O board and USB servo controller to command these little devices. The lasers will be mounted on the cameras for the initial calibration. Cameras will be mounted on the servos to help with the focus. Servos will be mounted on the linear actuators to keep stereo disparity stable. And the linear actuators will be set up so that the smallest distance between the cameras will be 3.5-4 inches. The sonar will be measuring the distance between the object and the scanner to help the software decide the optimal camera rotations and actuator lengths.
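
The actuator length decision could start from something as simple as the Python 3 sketch below. The "4 inches per 10 feet" rule and the 4 inch minimum come from the text above; the 12 inch maximum is an assumption (the minimum spacing plus two fully extended 4 inch actuators), and the function name is made up for illustration.

```python
def target_baseline_inches(distance_feet, min_baseline=4.0, max_baseline=12.0,
                           inches_per_10_feet=4.0):
    """Pick the camera spacing for an object at the given sonar distance.

    The ideal spacing grows linearly with distance and is then clamped to
    what the rig can physically reach (max_baseline is an assumed limit).
    """
    ideal = distance_feet * inches_per_10_feet / 10.0
    return max(min_baseline, min(max_baseline, ideal))

for distance in (5, 20, 50):
    print(distance, "ft ->", target_baseline_inches(distance), "in")
```

With these numbers the rig can only follow the rule for objects between 10 and 30 feet away; closer or farther objects simply get the nearest reachable spacing.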



I have the two cameras working at the same time at 120+ frames per second. They are streaming the frames while applying a gaussian blur and a canny edge detection algorithm while only using a quarter of a Core2Duo CPU. Plenty of CPU cycles left for the stereo detection and all the other stuff.

I started by just streaming clean, non-altered images. This took quite a while since I am still getting familiar with OpenCV. Part of solution was to specify the size and depth of the image variable I was assigning the capture to. Other part was to keep the grabframe, retrieveframe, etc. functions in an infinite while loop.

This allows for non-drop frame captures while taxing the processor since I can keep all the stereo functions in this loop solving all of my potential syncing problems pre-emptively. Next, I attempted to add some image alteration functions in place, and failed miserably... After some research, I realized that my problem was a combination of namespaces and again CvMat size and depth. Now the code works fine with the blur and edge detection. I am in the process of implementing a non calibrated stereo rectification algorithm from OpenCV into the code.

Python Grid Engine

I have been working on this grid engine script for some time now.

It's based on Python and uses as few third party libraries as possible to keep it platform independent. It works by connecting to each computer thru SSH and sending command line renders or any other job that can be distributed over the SSH command line.

I know there are a lot of renderfarm scripts out there, but they are all the same in principle: the user needs to output renderer information files such as ifd, rib, or mi2 on a workstation before sending the job to the farm. This takes a huge amount of time and is very inefficient, since all these files have to be distributed to the render nodes before the farm can even start rendering.

Now imagine doing this with a huge simulation that's cached to the hard drive. I usually end up with cache folders over 10 gigs, and it's impossible to send them to a renderfarm. My code deals with all these inefficiencies by storing everything on a network share, connecting to the render nodes thru SSH, and starting the render from the shell of the renderer, avoiding the creation and storing of the renderer information files frame by frame.

The following code snippet starts a child process and executes the SSH client, connects to the computer and sends in as many commands as needed:

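import os

def remotehoudinicmd(user, remotehost, cmd1, cmd2, cmd3):
    # Fork a child process attached to its own pseudo-terminal.
    pid, fd = os.forkpty()
    if pid == 0:
        # In the child: become the SSH client and pass the commands along.
        os.execv("/usr/bin/ssh", ["/usr/bin/ssh", "-l", user, remotehost] + cmd1 + cmd2 + cmd3)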

And this part of the code uses the procedure above to fill in the variables to source a Houdini shell and start a render thru the hrender command:

houdinilinuxlocation = "cd /opt/hfs9.5.303" + "\n"
houdinisourcefile = "source houdini_setup" + "\n"
houdinirendercmd= "hrender -e -f " + startframe + " " + endframe + " -v -d " + rendernode + " " + filepath
print "command: " + houdinirendercmd
remotehoudinicmd(username, remotehostip,  [houdinilinuxlocation],  [houdinisourcefile],  [houdinirendercmd])
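
For comparison, here is a hedged Python 3 sketch that builds the same kind of command as a single argument list for the subprocess module instead of forking by hand. The function name and the example host, user, and file paths are made up; the hrender flags and the Houdini path are the ones used above. It assumes the remote login shell understands `source` (bash, for example).

```python
import shlex

def build_hrender_ssh_argv(user, host, houdini_dir, startframe, endframe,
                           rendernode, filepath):
    """Return the argument list for an SSH call that sources Houdini and renders."""
    remote_command = " && ".join([
        "cd " + shlex.quote(houdini_dir),
        "source houdini_setup",
        "hrender -e -f {} {} -v -d {} {}".format(
            int(startframe), int(endframe),
            shlex.quote(rendernode), shlex.quote(filepath)),
    ])
    return ["/usr/bin/ssh", "-l", user, host, remote_command]

# Hypothetical values, shown only to illustrate the resulting command line.
argv = build_hrender_ssh_argv("render", "10.0.0.5", "/opt/hfs9.5.303",
                              1, 10, "mantra1", "/net/job/scene.hip")
print(argv)
```

The list can be handed to `subprocess.run(argv)`, which also makes the exit code of the remote render available for the error reporting mentioned below. Chaining with `&&` stops the remote side as soon as one step fails.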

Right now I'm working on error reporting and running this process that connects to render nodes and starts renders locally. I will also code or use a third party port scanner to find SSH servers running on the local subnet.

There are other things on the To Do list as well, such as storing the number of processors and amount of RAM each render node has along with its MAC address, and using this list to send fewer or more frames to different nodes to make sure the render job runs as efficiently as possible. Also a procedure that will collect error data from the local jobs and re-assign that part of the job if possible, or the whole chunk again, to another or the same node.

I am not releasing the code for this yet, as it is not even in alpha status. Updates to this code will be edited into this entry instead of new ones.


The IP scanner is in working condition. One little problem is that the computers within the given IP range have to be online, otherwise the script takes forever to run. I'll try to add multithreading to this procedure to make it faster.

What the script does is simple: it takes two IP addresses, unpacks each to a tuple, and takes the first value of the tuple, which is an integer that can be incremented in a loop. What the loop does is pack the integer back into its x.x.x.x form and connect to it on port 22, which is SSH.

The loop uses the connect_ex function, which returns an error value instead of raising an exception if it can't connect. If it can connect, it returns zero. If it returns zero, the IP address gets written down to a file to be read later on by the distribution procedure, and far down the line by another SSH procedure that records the configuration of the computer to be used by the distribution procedure again. Once it's done with the socket, it closes it, increments the integer form of the IP, and loops. Here is a snippet of the procedure:

import socket
import struct

def portscan(start, stop):
    # Convert the dotted addresses to integers so they can be incremented.
    unpackedstart = struct.unpack('!I', socket.inet_aton(start))[0]
    unpackedstop = struct.unpack('!I', socket.inet_aton(stop))[0]
    while unpackedstart <= unpackedstop:
        # Pack the integer back into its x.x.x.x form.
        ip = socket.inet_ntoa(struct.pack('!I', unpackedstart))
        socketobj = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # connect_ex returns 0 on success and an error value otherwise.
        result = socketobj.connect_ex((ip, 22))
        if result == 0:
            # Record the node so the distribution procedure can read it later.
            databasefile = open('/database', 'a')
            entry = ip + "\n"
            databasefile.write(entry)
            databasefile.close()
            print ip
        socketobj.close()
        unpackedstart = unpackedstart + 1
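
One way to keep offline hosts from stalling the scan is to put a timeout on the socket. The Python 3 sketch below is an alternative take rather than the script's actual code: it uses the standard ipaddress module for the integer conversion, and the half second timeout is an arbitrary assumption.

```python
import ipaddress
import socket

def ip_range(start, stop):
    """Return every IPv4 address from start to stop, both included."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(stop))
    return [str(ipaddress.IPv4Address(n)) for n in range(first, last + 1)]

def ssh_port_open(ip, timeout=0.5):
    """Return True if something accepts connections on port 22 of ip.

    With a timeout set, connect_ex gives up on an unreachable host after
    `timeout` seconds and returns a non-zero error value.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((ip, 22)) == 0

# The range rolls over octet boundaries like the integer loop above.
print(ip_range("192.168.1.254", "192.168.2.1"))
```

A scan would then be `[ip for ip in ip_range(start, stop) if ssh_port_open(ip)]`, with the worst case bounded by the timeout multiplied by the number of addresses.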


Time for an update, a major one. I figured out threading in Python, but as always hit a brick wall as soon as I worked my way around the threading problem. I am not sure if it was Python or the OS, but apparently one of them is not too big on opening the same file a few hundred times at once. So checking a large number of IP addresses at the same time was impossible, and I decided to run IP checks in batches of 50. The Python script spawns 50 threads to check the network nodes, pauses for half a second, then spawns another 50 threads. I opted for the pause function instead of waiting for each thread to end and spawning a new one in its place, because it does almost the same job with a lot less code, and when your program breaks and you need to debug, you always want less, simpler code. Here is the newest port scanning part:

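import socket
from subprocess import Popen, PIPE
from threading import Thread

class portscan(Thread):
    def __init__ (self,ip,databasefile):
        Thread.__init__(self)  # required before the thread can be started
        self.ip = ip
        self.databasefile = databasefile
        self.status = -1
    def run(self):
        database = open(self.databasefile,'a')
        socketobj = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        result = socketobj.connect_ex((self.ip,22))
        socketobj.close()
        if result == 0:
            # Ping the node once and trim the output down to the round trip time
            # (the value read from arg4 is not used in this excerpt).
            arg1 = Popen(["ping", "-c 1", "-t 1", self.ip], stdout=PIPE)
            arg2 = Popen(["grep", "time"], stdin=arg1.stdout, stdout=PIPE)
            arg3 = Popen(["cut", "-d", "=", "-f4-"], stdin=arg2.stdout, stdout=PIPE)
            arg4 = Popen(["sed", "s/.\{3\}$//"], stdin=arg3.stdout, stdout=PIPE)
            entry = self.ip + "\n"
            database.write(entry)
            print str(self.ip) +": OPEN"
        database.close()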

And the part of the python script that loops it is here:

import os
import socket
import struct
import sys
import time

print "Enter IPs to start and stop scanning:"
startip = raw_input("Start IP: ")
stopip = raw_input("Stop IP: ")

databasefile = str(sys.path[0])+"/nodeDB"
print databasefile
if os.path.exists(databasefile):
    pass  # the handling of an existing node list is not shown in this excerpt

# Convert both addresses to integers so they can be incremented.
unpackedstart = struct.unpack('!I',socket.inet_aton(startip))[0]
unpackedstop = struct.unpack('!I',socket.inet_aton(stopip))[0]

# Split the range into batches of roughly 50 threads.
batchcounter = 0
divideby = 1
totalthreads = unpackedstop-unpackedstart
if totalthreads >= 50:
    divideby = totalthreads//50
batchthreads = totalthreads//divideby
batchstart = unpackedstart
batchstop = batchstart+batchthreads
while batchcounter < divideby:
    print "batchcounter" +str(batchcounter)
    # Never scan past the stop address, even if the last batch overshoots it.
    while batchstart <= batchstop and batchstart <= unpackedstop:
        checkip = socket.inet_ntoa(struct.pack('!I', batchstart))
        threadcreate = portscan(checkip,databasefile)
        threadcreate.start()
        batchstart = batchstart + 1
    batchcounter = batchcounter + 1
    if batchstart > unpackedstop:
        break
    batchstop = batchstart + batchthreads
    time.sleep(0.5)  # pause before spawning the next batch
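
For reference, the alternative mentioned above (waiting for the threads to finish instead of pausing) does not need much more code in Python 3. This sketch is a different technique from the pause based loop, shown only for comparison; `worker` is a hypothetical function that checks one address, and the batch size of 50 is taken from the text.

```python
import threading

def batches(items, size=50):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def run_in_batches(targets, worker, size=50):
    """Run worker(target) in threads, never more than `size` at a time."""
    for batch in batches(targets, size):
        threads = [threading.Thread(target=worker, args=(target,)) for target in batch]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()  # wait for the whole batch before starting the next one

# Usage with a stand-in worker that just records what it was given.
seen = []
run_in_batches(list(range(120)), seen.append, size=50)
print(len(seen))
```

Joining guarantees that no more than 50 threads (and open files) exist at once, whereas a fixed pause only makes that likely; the cost is that one slow host holds up its whole batch.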

I also have some code to run as a threaded version of the previous Houdini render command, but my new road block is SSH authentication. Once I work around SSH problems, code should be ready to test. Only things left in the TODO list will be queue management, error checking, and adding a few more presets for other command line renderers.

Camp fire

I was doing some flame tests with Houdini before I get some real footage of a Mustang on the race track for the ghost rider project, so this is what I came up with. I know the final comp doesn't look good; I just slapped the flame on top of the instanced lights render and uploaded it, because all of them together take quite long to render and all I had at the time was an old laptop.

Anyhow, the flame effect itself is very simple. First I used a small sphere to spawn the particles, which are pushed up by a simple force node with some turbulence and some noise. It also pushes back the particles that get away from the central y axis of the origin, using -$TX and -$TZ parameters. They were also colored at the POP level to provide colors for the instanced lights, but I ended up using the second, bigger flame sim for that.

After I was happy with the POP I copied it to metaballs and then connected them to an isooffset node. Once calibrated, isooffset can convert the POP simulation, which is lightning fast, and the metaball copy functions into pretty much what DOP simulations would come up with, only much faster. Once that was done and cached to disk, I attached the basic flame shader to the simulation, added a camera, and finally rendered with normal micropolygon rendering.

After the first inner flame, I increased the volume of the particle source sphere, calibrated the forces to match, copied the POP over to the metaballs, added isooffset, and cached. Once it was done, I added the same shader again and rendered from the same camera.

Two flames comped together.

Now it's time to model some wood and a wavy surface to get some sublime shadows from the approx. 350 raytraced lights I will instance (painful).

For the surface I used the softbody simulation in POPs and collided it with some other objects, froze it, and exported it as a single bgeo file. Then I imported the bgeo back and put some rectangular prisms with the wood shaders on top of it.

Now, the killer function here is instancepoint(). It is evaluated per particle per frame, so when I added it to the point light that was to be instanced (which is the ONLY light node in the scene, by the way), every instance of the point light got the color attribute of the exact particle it was instancing, per frame. Per frame meaning that the color of the particles changes over time and that is transferred to the instanced lights as well.

Before I started the render I had to decrease the number of particles, so I re-spawned the second set of particles, only this time with lower birth attributes. This one takes a lot to render: 10+ hours for 240 frames on a Core2Duo 2.4GHz machine.