Thursday, August 11, 2011

Blender 2.5 Compositing from Python

If you are like me then you hate doing the same thing over and over ... which means you like to automate these kinds of processes when it makes sense to do so. I wrote a Blender plugin that imports a HiRISE DTM directly into Blender 2.5x and lets you easily make fly-throughs of the various Mars locations for which a DTM exists.

For work, I want to make automated fly-throughs with a trivial piece of compositing to place a foreground and a background image into the Blender scene. I came across very few working examples or documentation on how to drive the compositing part of Blender from Python, so ... here we go! I'll simply place the final example code here with comments and describe each section in a little more depth below.



import bpy


# 1) Use compositing for our render and set up some paths
bpy.context.scene.use_nodes = True

fgImageLoc = "/path/to/foreground.tiff"

bgImageLoc = "/path/to/background.tiff"


# 2) Get references to the scene
Scene = bpy.context.scene
Tree = Scene.node_tree
Tree.links.remove( Tree.links[0] )


# 3) The default env will have an input and an output (Src/Dst)
Src = Tree.nodes["Render Layers"]
Dst = Tree.nodes["Composite"]


# 4) Let's create two groups to encapsulate our work
FG_Node = bpy.data.node_groups.new(
    "ForegroundImage", type='COMPOSITE')
BG_Node = bpy.data.node_groups.new(
    "BackgroundImage", type='COMPOSITE')


# 5) The foreground group has one input and one output
FG_Node.inputs.new("Source", 'RGBA')
FG_Node.outputs.new("Result", 'RGBA')


# 6) The foreground node contains an Image and an AlphaOver node
FG_Image = FG_Node.nodes.new('IMAGE')
FG_Image.image = bpy.data.images.load( fgImageLoc )
FG_Alpha = FG_Node.nodes.new('ALPHAOVER')


# 7) The Image and the Group Input are routed to the AlphaOver
#    and the AlphaOver output is routed to the group's output
FG_Node.links.new(FG_Image.outputs["Image"], FG_Alpha.inputs[2])
FG_Node.links.new(FG_Node.inputs["Source"], FG_Alpha.inputs[1])
FG_Node.links.new(FG_Node.outputs["Result"], FG_Alpha.outputs["Image"])


# 8) Add foreground image compositing to the environment
newFGGroup = Tree.nodes.new("GROUP", group = FG_Node)


# 9) Route the default render output to the input of the FG Group
Tree.links.new(newFGGroup.inputs[0], Src.outputs["Image"])


# 10) The background group has one input and one output
BG_Node.inputs.new("Source", 'RGBA')
BG_Node.outputs.new("Result", 'RGBA')


# 11) The background group contains an Image and AlphaOver node
BG_Image = BG_Node.nodes.new('IMAGE')
BG_Image.image = bpy.data.images.load( bgImageLoc )
BG_Alpha = BG_Node.nodes.new('ALPHAOVER')


# 12) Create links to internal nodes
BG_Node.links.new(BG_Image.outputs["Image"], BG_Alpha.inputs[1])
BG_Node.links.new(BG_Node.inputs["Source"], BG_Alpha.inputs[2])
BG_Node.links.new(BG_Node.outputs["Result"], BG_Alpha.outputs["Image"])


# Add background image compositing, similar to 8/9
newBGGroup = Tree.nodes.new("GROUP", group = BG_Node)
Tree.links.new(newBGGroup.inputs[0], newFGGroup.outputs[0])
Tree.links.new(newBGGroup.outputs[0], Dst.inputs["Image"])


When you run this you will end up with a pipeline that looks like this:


The rendered scene outputs to the foreground image group, which outputs to the background image group, which in turn outputs to the file path specified in Blender. Each group is a composite of Blender primitives. When expanded (by selecting the group and pressing Tab) you will see this:


The group input and the Image node's output are routed into the Alpha Over node. The Alpha Over output is routed to the group's output. This overlays the Image node onto the scene. A similar setup is produced for the background image.

Here is a slightly more detailed breakdown of the script:

  1. Tell Blender that we have a special compositing setup
    • also, store info about where our foreground/background images are kept
  2. Blender's environment is not empty by default.
    • Get a reference to it
    • Clear the link between the Render Layer and the Composite output
  3. Get a reference to the default nodes for later use
    • Src is the source for rendered content from the scene
    • Dst is the node that takes an output and generates a file (or preview)
  4. To simplify our compositing graph, create two groups to encapsulate each function
  5. The group we just created has one input and one output
  6. Create two nodes
    • Image - acts as an output with a static image
    • AlphaOver - overlays two images using the alpha channel defined in the second image
  7. Create links between nodes.
    • It's easier to see these in the image above.
  8. Instantiate the new group in the compositing environment
    • This is where I was a little lost: the group needs to be instantiated, and a new object is returned. The input/output sockets of the returned object are the external input/output ports of the group. If you use the previous object you will make bad connections to the internal structures of the group. Don't do it!
  9. Connect the rendered image to the foreground group input.
  10. Steps 10 through 12 (plus the final block) are pretty much the same as steps 5 through 9.
    • Different connections make the image a background image instead of a foreground image

Thanks to Uncle_Entity and Senshi in #blendercoders@freenode for fixing my initially poor usage.

Here are a few links that use the compositing above. Notice that the text hovers above the DTM while the background ... stays in the background:

Saturday, April 30, 2011

IPv6 - how to listen?

In working with IPv6 I've come to realize that there are a lot of funny defaults out there. The most basic of them seems to be in how to bind to a port. Simple? Yeah ... or so I thought.

Under some versions of Linux, if you just listen on, say, [::]:80, then people connecting to your site via IPv4 might be logged as ::ffff:192.0.32.10, which is an IPv4-mapped IPv6 address. This means that everyone is allowed to talk IPv4, but your server gets to deal with everything as if it is an IPv6 address ... pretty much. DNS resolution gets a little fuzzy here, but we'll sweep that issue under the carpet for the remainder of this post.

Under other versions of Linux, you may not accept IPv4 connections at all when listening on just [::]:80. As it turns out, this is a configurable default, and there are (of course) arguments both ways about which is better. I personally like the one-socket-fits-all approach, but I'm also very pro-IPv6. The magic default under Linux can be found/set via "sysctl net.ipv6.bindv6only".
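Rather than relying on the OS default, an application can request dual-stack behavior explicitly via the IPV6_V6ONLY socket option. A minimal Python sketch of that (port 8080 is an arbitrary choice; note that some platforms refuse to clear the option):

```python
import socket

# One socket for both protocols: explicitly clear IPV6_V6ONLY instead
# of depending on the system default (net.ipv6.bindv6only on Linux).
sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

# 0 = also accept IPv4 clients (they show up as ::ffff:a.b.c.d)
# 1 = this socket is IPv6-only
sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)

sock.bind(("::", 8080))
sock.listen(5)
```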

Stepping outside of the wonderful world of Linux and into the world of Solaris/OpenSolaris (RIP), you'll find behavior that consistently matches what Linux would know as net.ipv6.bindv6only=1. In fact, I never did find a way to change the default under Solaris and had to revert to funky configurations that specified two listening sockets: one for IPv4 and another for IPv6. In some cases this was more than a simple annoyance; it was impossible. In the case of Ejabberd, things get ugly. There is no way to specify which behavior you want and, on top of that, connections are managed inside Ejabberd via a hash keyed by port. That means you can't listen twice on the same port! I hacked around this issue in our environment, but I look forward to not needing the hack in the future.
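For systems that behave like bindv6only=1, the workaround is the two-socket pattern: one listener per address family on the same port. A rough Python sketch (port 8080 again arbitrary):

```python
import socket

# IPv4 listener
v4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
v4.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
v4.bind(("0.0.0.0", 8080))
v4.listen(5)

# IPv6-only listener on the same port; V6ONLY=1 keeps it from
# clashing with the IPv4 socket on dual-stack kernels.
v6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
v6.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
v6.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
v6.bind(("::", 8080))
v6.listen(5)
```

Every accept loop and firewall rule now has to be maintained twice, which is exactly the added complexity discussed above.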

Another place this becomes a problem is in our nginx configurations. On Solaris we have something that looks like:


server {
 listen 80;
 listen [::]:80; 
...

}

but when migrating this configuration to Linux, where the default is net.ipv6.bindv6only=0, we simply use:


server {
 listen   [::]:80;
 ...
}


This does close to the same thing. Our log files may change a little, since ::ffff: now appears in front of our IPv4 entries, but everything else pretty much stays the same.

Alternatively we can do (for the default server):


server {
 listen   80;
 listen   [::]:80 default ipv6only=on;
 ...
}


and then we are back to the kludge of using two different sockets for pretty much the same thing. There are applications where providing a different answer on IPv6 than on IPv4 makes sense but most of the time it doesn't.

What can we do as application developers to do things the right way the first time? That's highly language dependent. Some high-level languages don't distinguish between IPv4 and IPv6 unless you dig a little and ask specifically for it. The problem is that they may be compiled with or without IPv6 support (like Ruby), and then you may be powerless to use IPv6 at all. In other languages you will need to make small adjustments (e.g., C code needs to use getaddrinfo() instead of gethostbyname()). Google is your friend here, and be sure to check out tools like IPv6 CARE, which can tell you what is wrong as well as dynamically patch a running binary to do the right thing. Pretty slick!
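The getaddrinfo() approach is protocol-agnostic: you iterate over whatever address families the resolver hands back instead of hard-coding one. The same idea sketched in Python, asking for listening addresses on an arbitrary port:

```python
import socket

# AF_UNSPEC means "any family"; AI_PASSIVE asks for addresses
# suitable for bind()/listen() rather than for connecting out.
results = socket.getaddrinfo(None, 8080,
                             socket.AF_UNSPEC, socket.SOCK_STREAM,
                             0, socket.AI_PASSIVE)

# On a dual-stack host this typically yields both an AF_INET6
# and an AF_INET entry; bind one socket per entry (or just the
# IPv6 one if the platform maps IPv4 for you).
for family, socktype, proto, _canonname, sockaddr in results:
    print(family, sockaddr)
```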


Finally, what is the best practice for listening on an IPv6 socket? My preference is to listen once and get all traffic on the one socket, but there are cases where using two sockets is desirable. This means the best practice is to be configurable and capable of doing both. You could make me happy and default to one socket for your application, though! It makes IPv6 "just work".

Thanks for reading and Happy Hacking!

Friday, April 29, 2011

NFS over UDP, fast + cheap = not good?

From yesterday's post, I took off and started researching ways that NFS over UDP can go wrong. I am now sufficiently scared away from ever writing/appending to files over NFS on UDP. There are references everywhere to potential data corruption, but only a few good sources give anything concrete on the topic. Those references seem to be a bit outdated, but the cautious sysadmin/engineer side of me is now screaming, "OK, fast + cheap somehow usually means not good."

Most of the cases I came across dealt with writing via UDP and not so much reading via UDP. There were some cache issues mentioned, but we have run into those regardless of UDP/TCP, so nothing new there. The particular use case of the previous test only needs to read, but considering our general systems infrastructure we definitely need write functionality, so UDP is probably not a good idea anymore.

Now that the NFS path has been travelled, maybe I can find a better way?

Thursday, April 28, 2011

NFS Performance: TCP vs UDP

I have found many places that will state that NFS over TCP is not appreciably slower on Linux unless you measure carefully. In fact, I found so many of them that I took it for granted for a while... until today.

Here is the backstory (circa 2005):

We are an NFS shop scaling to meet the demands of HiRISE image collection/planning/processing and we are having severe problems scaling our NFS servers to handle processing and desktop loads on our network. Turns out the final fix for some issues was to use TCP for our NFS traffic. (BTW: thanks for the pointer from our then-in-testing-vendor, Pillar) Ok, simple fix! Quick tests show that performance is about the same.

Some time after this I was working with our DBA to try to speed up a process that reads a bunch of records from the database and then verifies the first 32KiB of each of about a half-million files that we have published via the PDS standard. I mention that we have some shiny new SunFire T1000 servers with 24 cores, which could speed this effort up via threading. He takes this to heart and threads his code so each thread checks a single file at a time. We did get a speedup, but definitely not 24x.


Ok, jump forward to the present day, literally today. I spec'd some hardware and put together a 5-node cluster to virtualize a bunch of operations on. Each host has 32 cores and 128GB RAM and the particular DomU we were testing with has 24 vcpus and plenty of RAM for this task. Our NFS servers are also Sun (Oracle *cough* *cough*) servers with fancy DTrace analytics which can tell you exactly what is going on with your storage. All of this should be very capable and the network in-between is rock solid and has plenty of bandwidth... so why are we only peaking around 35 reads per second? Why is this job still taking half a day to complete?

The network is not heavily loaded, the fileserver is twiddling its thumbs and the host has a load of about 60. I do some quick calculation and figure out that the computation is speeding along and processing a "whopping" 1MB of text every second (sigh.) Ok, let's point a finger at Java. It's certainly not my favorite language and as far as string processing goes, there are much better alternatives IMHO.

I gingerly ask our DBA, who wrote the code, if I can peruse it to see if I spot anything that could be optimized. He obliges and I peruse the code. Of course, being an abstraction on an abstraction, I'm sure there is a lot to be gleaned from digging deeper, but nothing pops out at me as needing to take up this entire node to process text at 1MB/s. I mention that it could be the abstractions underneath; our DBA asks why it is faster in a certain case (I haven't verified that, but I believe him) and I decide, "OK, let's take Java out of the equation. import python." So here is the script I wrote to approximate his Java class:


#!/usr/bin/env python

import sys
import threading
import Queue


class read32k(threading.Thread):
    def readFileQ(self, fileQ):
        self._fq = fileQ

    def run(self):
        # Drain the queue; get_nowait() avoids blocking forever if
        # another thread grabs the last item between empty() and get()
        while True:
            try:
                fn = self._fq.get_nowait()
            except Queue.Empty:
                return
            f = open(fn)
            data = f.read(32768)
            f.close()


# need to join our threads later on
threads = []

# need a queue of files to look through
fileQ = Queue.Queue()
for arg in sys.argv[1:]:
    fileQ.put(arg)

# initialize/start the threads
for t in range(60):
    readchunk = read32k()
    readchunk.readFileQ(fileQ)
    readchunk.start()
    threads.append(readchunk)

# wait for all threads to return
for t in threads:
    t.join()

# note the number of files processed
print "read chunks of %d files" % (len(sys.argv) - 1)



Pretty brain-dead simple (and sloppy, sorry...). For the non-Python coders who are casually interested: the script starts 60 threads, each of which reads a 32KiB chunk from files pulled off a queue populated from the command line. I invoke it as such:

bash# time (find . -name '*.IMG' | xargs /tmp/test.py)

and it takes on the order of the time the Java class was taking with NFS over TCP (with much less CPU usage, though...)

OK. What gives?!? I have seen a Mac OS X host push the fileserver harder from a single-threaded app than this entire host is pushing with a multi-threaded one. I look into NFS settings; jumbo frames are enabled everywhere; nothing is popping out at me. OK, let's take a step back and look at the problem. No matter how many threads I use, the performance stays the same. What is single-threaded here and can't scale to meet these demands? It slowly occurs to me that, while TCP ensures that all packets get from client to server and back, it also ensures in-order delivery of that data. I think a little further and wonder, "What are the chances that the people who posted these comments about NFS over TCP have ever done real parallel tests against a capable NFS server?"

NFS TCP Operations
As it turns out, they probably didn't. Above is a portion of the DTrace output from the fileserver while using TCP for NFS. The Python script took 6 minutes to read 17836 files (about 49 files per second). Changing nothing else, below is a similar screenshot while using UDP instead of TCP.


Yes, that's the whole graph. I had to move my mouse quickly enough to grab the shot before it scrolled off the screen. The same files, re-read over a new mount using UDP, took a total of 16 seconds (360s / 16s = a 22.5x speedup). We can see that latencies are much lower, but the latency differences alone do not account for this speedup. While I have not dug deeper, I suspect UDP allows multiple out-of-order requests in flight in a way that can't currently be achieved with TCP.

Ok, so the take-away message is: NFS over TCP can be much slower than NFS over UDP.

Now the real work is in front of me: "Can we re-implement NFS over UDP now that our network infrastructure is solid?" or maybe even, "Does it make sense to deal with other failure modes to gain a 22.5x improvement in speed?"

... only time will tell. Maybe we can come to the answer 22.5x faster by just implementing it across the board ;) ...

Kidding of course! (sorta)

Wednesday, November 10, 2010

IPv6, the new IPv4

I've been using IPv6 since I was asked to work on a project relating to it in 2002. At the time it was a cool "new" protocol to play with that had some really neat auto-configuration features. I deployed it only in small, limited sites where I could view the old animated IPv6 turtle. More recently I have been using IPv6 in production and have seen a lot of attempts at making political grabs on IPv6. The one that amuses me the most is the technical grab: "hey, we can change things now that were problems with IPv4 before."

I have found that, the majority of the time, if the problem was easy to just fix, there is already a solution for it in IPv4 ... with the exception of global address space, of course. People like to use IPv6 as a crutch and have a different network layout and rules for it than they do for IPv4. The problem is that having two distinct paths to all of your hosts, with a different set of rules for each path, does not create a stronger whole. At best it stays as strong as it was; the rest of the time you are making your network weaker. By weaker I am not talking just about security, but usability as well.

Security is paramount so I'll discuss this for a moment. Let's assume that we have a dual-stack network with 100 nodes. We have done our due diligence in IPv4 and have a firewall in place to make sure certain traffic does not flow, however historically the network has been open so we can't just close off all ports except the ones we know we have published. Now comes IPv6, the new historically un-used protocol... at least for this network. We have the opportunity to do things the way we want from scratch, right? Let's just close off all ports except the ones we know we publish in IPv6 and we will have a more secure network going forward! Hoorah! Wait a second, that doesn't help security because everything is still available in IPv4 and now you are managing two different rule sets for all of your services. Two different rule sets in itself is added complexity which means more room for human error.


Now, let's imagine we have a split administration for network management and host management. The network IT staff are now happy with their more secure IPv6 network and meanwhile the host IT staff are saying, "ah ... finally we don't need to worry about random traffic to these hosts." From this they leave their vendor defaults of no IPv6 host based firewall while having restrictive IPv4 host based firewalls. This is essentially the opposite scheme as the network folks have taken. IPv6 is wide open while IPv4 is restricted down.


I'm sure some of you white hats are starting to raise an eyebrow as you realize that any one compromised host can now access all services available via IPv6 on the local lan. What you thought you had locked down in IPv4 host based firewalls is completely open under IPv6 (but only locally). You black hats are probably thinking to yourselves, "hmm, but how common is IPv6 and is it worth my while?" Given that there are a lot of rogue IPv6 deployments from 6to4 routers and the fact that you could easily configure an entire network by sending out the right SLAAC packets ... as common as you want. Also, finding local IPv6 hosts is usually as easy as "ping6 ff02::1"


Now that I have put most of the key ingredients out there for a really nasty worm that exploits this not-so-uncommon topology, let's talk briefly about usability. A local developer decides to deploy a service on port 8080. When he is local, his client-server connections just work because IPv6 host based or site-level firewalls are not in the way. He is also developing in a high level language so he doesn't even need to know that IPv6 is being used for this to work. Later, he goes home and tries again ... except he only has IPv4. The flow might look something like this: Ok, fair enough ... the firewall is getting in the way as usual. Time to flip on the VPN. Hmm, it's still not working? I know it was working locally when I was there and I haven't changed anything so it must be the VPN. Let's send an email to our IT staff and let them debug this wretched VPN again. Maybe I will just ask for port 8080 to be opened instead. Harrumph, another night of no progress. ~insert beer here~

The next morning the IT staff says, "Oh, he needs port 8080 opened for access from home" and makes the appropriate request to the network folks. IT responds and says, "OK, we opened port 8080, give it another try." The developer, waking up a little groggy decides to go to a local coffee shop to check email and get the day started. He sees the email and decides, "alright, I'm in a coffee shop ... I'm still remote so let's test this really quickly. Hey, it works! Perfect! I can work from home tonight." Little did he know the coffee shop had an Apple Airport which sets up 6to4 for the local clients. It worked and bypassed the host based firewall again.

At this point I can keep iterating back and forth in this scenario and see that at least one more day goes by before anyone sits down and gets all of the glitches out of the connection process. All of this could have been avoided had all rules been held consistent between IPv4 and IPv6.

Now, given this one example I will say that usability is impaired and then leave it as an exercise to the reader to discover other scenarios that arise in a complicated asymmetric configuration. Solving this one problem does not solve the others. I know that sufficiently creative users manage to find all kinds of great ways in which to break the perfect models we make ... especially when we make them more complicated ... and then even more so when they don't even have to realize that more is going on below the hood than they care to know.

The bottom line is: making IPv6 significantly different from IPv4 in your organization will invariably cause problems down the road. From a user's perspective, the network should just work and not get in the way.

Tuesday, February 16, 2010

Walking around the world -- compiling Gentoo Prefix

Given that I'm compiling Gentoo Prefix on a 2.3GHz Linux host and it is taking several hours, I started imagining how much work the computer is actually doing.

2.3 GHz = 2,300,000,000 cycles per second.

Assuming we only use 10% of the CPU because we are waiting on I/O, we are still talking 230,000,000 cycles per second.

Each cycle represents a very small operation, and perhaps only 10 little operations make up some quantum of what a human would consider an "operation" like "take one step forward." Now we are talking about 23,000,000 human steps per second.

The average human stride is perhaps 34 inches ... leading to 12,342 miles worth of steps per second. That means that every two seconds I am asking the computer to walk around the Earth. Gentoo Prefix is taking several hours to bootstrap, so at 3600 seconds per hour for 3 hours, that equates to roughly 5,363 "walks around the Earth" just to bootstrap Prefix once.
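A quick sanity check of the arithmetic above (taking the Earth's circumference as roughly 24,860 miles, which the estimate appears to assume):

```python
# Back-of-the-envelope version of the estimate above.
cycles_per_sec = 2.3e9              # 2.3 GHz
useful = cycles_per_sec * 0.10      # assume 90% of time waiting on I/O
steps_per_sec = useful / 10         # ~10 cycles per "human step"
inches_per_sec = steps_per_sec * 34         # 34-inch stride
miles_per_sec = inches_per_sec / 63360.0    # 63,360 inches per mile

earth_miles = 24860.0
walks = miles_per_sec * 3 * 3600 / earth_miles

print(round(miles_per_sec))  # ~12342 miles per second
print(round(walks))          # ~5362 trips around the Earth in 3 hours
```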


Oops, the bootstrap just died. I need to tweak something and restart it...


Each of those lines of code has been carefully written/re-written/debugged/versioned by human hands. A person can only write so much code in their lifetime: how many lifetimes of code are flying by me every time I press enter?

Monday, January 4, 2010

New Years Meta-Resolution

Each new year brings oodles of expectations including the dreaded/coveted "new years resolution." My resolution is actually a meta-resolution and it is "to not need resolutions next year." If I live well enough from day to day then I don't need a special date to make my life better.

Cheers!

Saturday, December 26, 2009

The Physics Factory


This is an interesting phenomenon: a bus full of great demonstrations and wacky physicists who put on a great show and engage the audience with wonderful results! I spent a bit of time with the gang in the earliest days and have been pleased to see their successes grow.

It seems that they are starting to look at accomplishing one of their earlier goals of an actual Physics Factory, since Tucson should be in the market for something like this! Check out their progress and contribute if you can!

f1r2t b10g! ... ok, maybe I'm a little late in the game for that one, but it's MY first anyway.

For lack of a specific topic, this blog will be about interesting things I come across, with a small'ish splash of my own thoughts. Enjoy! ;)