Saturday, 27 April 2013

3D Printing Nylon

Oozy, webbed nylon printing
Nylons can be exceptionally strong, much more so than ABS or PLA, especially under tensile stress. That makes them a tempting material for 3D printing mechanical parts, so I grabbed some Taulman Nylon 618 to check it out.

At first my prints were unbelievably oozy - the nylon just flowed from the nozzle and leaked all over the print. I also had problems getting it to adhere to standard blue tape. Apparently both of these are pretty common.

Damaged bed surface after print removal
I addressed the bed-sticking problem with a cheap sheet of MDF board; 618 nylon sticks to cellulose really well. In fact, it works so well that the board's surface gets damaged each time I remove a print. Ideally I'll find some tape or mesh to decrease the contact area, or maybe just something less sticky; failing that, I could make the wooden bed easy to remove and simply swap boards out (given how cheap the stuff is).

The oozing problem was far more insidious. Initially I tried to use retraction, but it had no effect (actually, it made things worse by causing the print head to sit still momentarily, melting the plastic beneath it and letting it flow). Then I happened upon a better idea: get rid of retraction entirely and instead move the print head so quickly that not much plastic has time to flow during travel. Combined with the 'avoid crossing perimeters' flag in slic3r (which forces the printer to complete each part separately, getting rid of that spiderweb of inter-part trails) and a switch to a 0.2mm nozzle, this produced magnificent results:
A lego block being printed with very clean, ooze-free walls
Notice that only a single line of oozed filament connects each component, thanks to the 'avoid crossing perimeters' flag
I also had to significantly lower the extrusion multiplier to aid ooze prevention, though this may just be the result of thicker-than-specified filament. I'm still tweaking the extrusion rate settings to get perfectly smooth surfaces, but the intermediate results are totally functional:
InMoov index finger - the surface is still a bit rough, but the sanded joints are slippery-smooth and it's many times stronger than the ABS versions I've managed to make (which suffer from cracking and shattering near the weak, stress-bearing pivot points).
Printing at the full 245°C significantly improved the strength of printed parts and had no apparent adverse effects on the teflon of the Printrbot Ubis hotend (though be aware that every thermistor has a slightly different calibration). Fortunately, the Ubis hotend only has teflon on the outside of the PEEK barrel, so it should sit a few degrees cooler than the nylon inside. I didn't use any bed heating at all, and using a cooling fan while printing turned out to have a slightly detrimental effect.
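
For reference, here's roughly what those settings look like as a Slic3r configuration snippet. This is only a sketch of my setup; the option names come from Slic3r's exported config.ini, but the exact values (and some names) will vary between Slic3r versions and printers:

# Approximate Slic3r settings for Taulman 618 (illustrative values only)
nozzle_diameter = 0.2
temperature = 245             # hotend temperature; gives noticeably stronger parts
bed_temperature = 0           # no heated bed
retract_length = 0            # retraction disabled entirely
travel_speed = 150            # keep travel moves fast so the nylon has little time to ooze (value illustrative)
avoid_crossing_perimeters = 1 # complete each part separately to avoid inter-part webbing
cooling = 0                   # the part-cooling fan was slightly detrimental
extrusion_multiplier = 0.9    # lowered below 1.0 to fight oozing; tune for your filament (value illustrative)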

Nylon loves to absorb water, and that water boils and bubbles out of the extruder during printing. It's often recommended to dry the filament in an oven at low heat before printing, but I haven't found that to be necessary: there's a small trail of steam from the print nozzle, but there appear to be no adverse effects, though this may not hold if you live in a high-humidity area.

I could comfortably hammer the printed lego parts without damage; a massive improvement over ABS. Despite some reports that gluing/bonding 618 is ineffective, I found that superglue worked perfectly on sanded nylon parts (see the index-finger above).

Apparently, natural nylon is very easy to colour with fabric dyes too; this allows each printed part to be coloured separately, removing the need for multiple rolls in different pre-dyed colours (though I have yet to test this).

All in all I'm quite impressed; nylon 618 is a fiddly material for printing, but the results are certainly worth it for any part that needs significant mechanical strength.

Thursday, 25 April 2013

Powersets

While looking over purported Google interview questions, I found one that was just 'write some code to make powersets'. It turns out that the powerset of a set $S$ is the set of all its subsets (including itself):
$$\mathcal{P}(S) = \{\, s \mid s \subseteq S \,\}$$
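For a concrete example, a two-element set has exactly four subsets (in general there are $2^{|S|}$ of them, since each element is either included or excluded independently):
$$\mathcal{P}(\{1, 2\}) = \{\, \emptyset,\ \{1\},\ \{2\},\ \{1, 2\} \,\}$$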
My first attempt at a solution was an ugly iterative algorithm that used lists to represent sets. An hour or so of tinkering led me to this lazy, recursive solution using Python's built-in set type (which is a hash table and thus has constant-time lookups):
def subsetsOfLength(S, l):
    ''' Return a generator of all subsets of S that are of length l.

        Note: this function uses recursion, so l must be less than your Python implementation's
        recursion limit (1000 by default in CPython).
    '''

    if l > len(S):
        pass # Return a length-zero iterator when subsets larger than S itself are requested.
    
    elif l == 0:
        yield set() # The empty set is the only set of length 0
        
    elif l == 1:
        for s in S:
            yield {s} # Just yield each element if only length-one elements are required
        
        # This clause isn't strictly required, but it saves len(S) redundant function calls that would all
        # return the empty set (to be unioned with x)
        
    else:
        S = S.copy() # Avoid mangling the original parameter
        
        while S:
            x = {S.pop()} # x is a set containing a single, arbitrary element from S

            # Yield the union of x with every possible subset of S\{x} (that has
            # the correct length to make the union be length l):
            for y in subsetsOfLength(S, l-1):
                yield x.union(y)

def powerset(S):
    ''' Return a generator for the powerset (the set of all [non-strict] subsets) of some set S
    '''

    for l in xrange(len(S)+1):
        for x in subsetsOfLength(S, l):
            yield x


if __name__ == '__main__': # Module Test/Demo
    print 'Length-3 subsets of S = {1,...,5}'
    print '\n'.join((str(x) for x in subsetsOfLength(set(range(1, 6)), 3)))
    print
    print 'Powerset of S = {1, 2, 3}'
    print '\n'.join((str(x) for x in powerset(set(range(1, 4)))))
    print
    print 'The length of the powerset of S with |S| = 12 is %d' % sum(1 for _ in (powerset(set(range(12)))))
        # This could be calculated a-priori, as 2^|S|, but that wouldn't be a test
Here's the test output:

Length-3 subsets of S = {1,...,5}
set([1, 2, 3])
set([1, 2, 4])
set([1, 2, 5])
set([1, 3, 4])
set([1, 3, 5])
set([1, 4, 5])
set([2, 3, 4])
set([2, 3, 5])
set([2, 4, 5])
set([3, 4, 5])

Powerset of S = {1, 2, 3}
set([])
set([1])
set([2])
set([3])
set([1, 2])
set([1, 3])
set([2, 3])
set([1, 2, 3])

The length of the powerset of S with |S| = 12 is 4096
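
For comparison, here's a minimal sketch of the standard itertools approach to the same problem. It yields tuples rather than sets, so it isn't a drop-in replacement for the generator above, but it makes a handy sanity check:

from itertools import chain, combinations

def powersetTuples(S):
    ''' Generate the powerset of S as tuples, using only itertools. '''
    items = list(S)
    return chain.from_iterable(combinations(items, l) for l in xrange(len(items) + 1))

print sum(1 for _ in powersetTuples(set(range(12)))) # 4096, matching the result above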

Prime Numbers

A few months ago, during a lecture on Haskell and functional programming, we were introduced to the Sieve of Eratosthenes, an elegant way to generate prime numbers. We were shown a neat implementation in Haskell that used just a few lines; I wanted to see if I could replicate that neatness/laziness in Python.

from itertools import takewhile, chain, count, islice
from collections import deque

knownPrimes = [2] # Numbers are tested by counting upwards from here. This also acts as a cache.

def storeAndReturn(p):
    knownPrimes.append(p)
    return p

def isNotDivisibleByKnownPrimes(n):
    return all((n % p) != 0 for p in takewhile(lambda p: p**2 <= n, knownPrimes))

def primeFilter(sequence): return (storeAndReturn(i) for i in sequence if isNotDivisibleByKnownPrimes(i))

def allPrimes(): return chain(knownPrimes, primeFilter(count(knownPrimes[-1] + 1)))

def primesUpTo(limit): return takewhile(lambda p: p < limit, allPrimes())

def firstNPrimes(n): return islice(allPrimes(), None, n, None)

def nthPrime(n): return deque(firstNPrimes(n), maxlen=1).pop() # deque construction is apparently the
# fastest way to consume an iterator (partly because it's raw C)

if __name__ == '__main__': # Module Test
    print "Primes below 10:"
    print '\n'.join(map(str, primesUpTo(10)))
    
    print "First 10 primes:"
    print '\n'.join(map(str, firstNPrimes(10)))
    
    print "10,000th prime:"
    print nthPrime(10000)

I succeeded, and even added some more interesting ways to use the allPrimes() generator (although a few of the one-liners should probably be broken out to aid readability). Here's the test output:

Primes below 10:
2
3
5
7
First 10 primes:
2
3
5
7
11
13
17
19
23
29
10,000th prime:
104729

Neat, huh?

I love infinite generators - it's flat-out cool to be able to create an object describing every prime number using just a few lines.
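
As a small illustration of that composability, here's a sketch of a hypothetical helper (not part of the module above) that lazily picks out the primes in an arbitrary range by passing allPrimes() through a couple more itertools functions:

from itertools import dropwhile, takewhile

def primesBetween(low, high):
    ''' Lazily yield the primes p satisfying low <= p < high, reusing allPrimes(). '''
    return takewhile(lambda p: p < high, dropwhile(lambda p: p < low, allPrimes()))

print list(primesBetween(100, 150)) # [101, 103, 107, 109, 113, 127, 131, 137, 139, 149]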

Cheating (sort-of) at Draw Something

A year or so ago, Draw Something was all the rage. For those who don't remember, it's basically Pictionary: each turn one player draws a word and their opponent/partner has to guess it from a given set of letters, then the roles swap. Usually it's lots of fun, but occasionally you get something ridiculous and unintelligible.

What?
Unwilling to accept defeat and too stubborn to ask for help, I set out to solve the problem automatically. On the assumption that bad Draw Something images contain essentially no information, I decided to approach the problem using the text alone.

The idea: find all combinations and permutations of the given letters that are of the correct length and sort them by how frequently they're used. Initially I intended to use Google search hits to rank the letter combinations; obviously this would be unbelievably slow and, as it turns out, violates Google's terms of service. Eventually I settled on a much better solution: use a local copy of Wikipedia. It's important to use something other than a dictionary because Draw Something frequently uses brands and pop-culture references that wouldn't crop up in most standard dictionaries.

The article text alone was around 30GB (unzipped) when I grabbed it at the start of 2012, so lazy parsing is essential. I used Python's etree module, but using it lazily turned out to be non-obvious: even with the etree.iterparse function, all processed nodes are kept in memory forever; they're just loaded one by one. The answer, obvious in retrospect, is to delete each node after reading its contents using myElement.clear(). After separating each word of text, I add it to a big dictionary mapping words to their frequency counts; that dictionary then gets pickled and saved to disk, ready for later use.

Here's the code:
from glob import glob
import cPickle as pickle
from itertools import ifilter
import xml.etree.ElementTree as etree
import string
import re
import unicodedata

wordCount = {}

identity = lambda x: x

def incrementCount(word):
    wordCount[word] = wordCount.get(word, 0) + 1

regexWordSeparators = "&[^;]+?;" # Matches HTML entities like &lt; so they can be treated as separators
charWordSeparators = "()[]{}|.,/_=:&;\\-^~"
charSeparatorTranslation = string.maketrans(charWordSeparators, " "*len(charWordSeparators))
rubbishChars = (string.punctuation + string.digits).translate(string.maketrans("",""), charWordSeparators)

def countWords(text):
    text = unicodedata.normalize('NFKD', u'%s' % text).encode('ascii','ignore') # Much nicer conversion!
    
    text = re.sub(regexWordSeparators, " ", text).replace("%20", " ") # replace HTML entities like &lt; with spaces
    text = text.translate(charSeparatorTranslation, rubbishChars).lower() # clean and prepare for
        # splitting (by whitespace) into words
    text = ifilter(identity, text.split()) # Remove empty strings if there are any
    
    map(incrementCount, text)

def processElement(elem, event):
    txt = None
    if event == 'end' and elem.tag[-4:] == 'text':
        txt = elem.text
    
    elem.clear() # Delete the element's contents from the in-memory tree!
    root.clear() # Also drop the (now empty) children that accumulate on the root element;
        # each has a tiny memory footprint, which adds up to a real leak on HUGE files
    return txt


# Parse XML Tree:
itertree = etree.iterparse(glob("./*.xml")[0], events=('end', 'start')) # We don't care about any events except start and end


# Define Article-Text Generator:
firstEvent, root = itertree.next() # The first event is the 'start' of the root element; keep it so processElement can clear it
textSnippets = ifilter(identity, (processElement(elem, event) for event, elem in itertree))


map(countWords, textSnippets) # Do the counting work here

# Store
with open("wordCount.pckl", 'wb') as f:
    pickle.dump(wordCount, f)
    
print "All of Wikipedia processed! %d words found :)" % len(wordCount)

It takes about an hour to run and the resulting file is 180MB or so; that's a lot of words! It turns out that when you have as much text as Wikipedia, occasional typos cause every word to permute into dozens of misspellings. Ideally, a threshold (say, 5 occurrences) would be used to filter words before pickling them. Rather than waiting another hour, I just filtered the words after matching them to permutations - it makes every search slower because it has to load and search the full 180MB dictionary before filtering.
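
That pre-pickling threshold would only take a few extra lines; here's a sketch of how it might look, tacked onto the end of the counting script above (the 5-occurrence cutoff and the output filename are just placeholders):

MIN_OCCURRENCES = 5 # placeholder cutoff from the discussion above

filteredCount = dict((word, count) for word, count in wordCount.iteritems()
                     if count >= MIN_OCCURRENCES) # drop one-off typos and other rare junk

with open("wordCountFiltered.pckl", 'wb') as f: # hypothetical filename
    pickle.dump(filteredCount, f)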

Speaking of permutations, Python's itertools library has some wonderfully simple functions for lazy combinatorics. Generating potential words of a certain length is just: permutations('abbdejmnorux', 7)

Combinatorial searches are naturally enormous, but the fixed sizes of this problem keep things manageable on modern hardware: 12 letters taken 7 at a time gives 12!/5! = 3,991,680 candidate strings (before removing the duplicates caused by the repeated 'b'). Here's the final code:
import cPickle as pickle
from itertools import permutations, ifilter, ifilterfalse, islice


# Parameters #

LETTERS = "abbdejmnorux"
LENGTH = 7


# Definitions #

with open("wordCount.pckl", 'rb') as f:
    wikiWordCount = pickle.load(f)

def usageFrequency(word):
    return wikiWordCount.get(word, 0)

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    # This python recipe is taken from: http://docs.python.org/2/library/itertools.html
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in ifilterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element


# Specification of Output #

candidates = unique_everseen(permutations(LETTERS, LENGTH))

candidates = (''.join(candidate) for candidate in candidates) # join each tuple of letters into a candidate string

# zip candidate strings to usage frequency, sort+filter by this value, then unzip and dump frequency info
candidates = (pair[0] for pair in sorted(
        ifilter(
            lambda pair: pair[1],
            ((candidate, usageFrequency(candidate)) for candidate in candidates)
        ),
        key = lambda pair: -pair[1]
    )
)


# Evaluation & Output #

print '\n'.join(reversed(list(islice(candidates, 0, 200))))

print "\ndone!\n"
This takes around 30 seconds to crawl through every possible mix of letters, sort them by usage frequency and print out the top few; that's pretty good!

Here's the tail end of the output (the most frequently used candidates are printed last):
...
moderna
jourdan
bombard
broaden
majored
dunmore
unarmed
abdomen
rebound
bermuda

done!

Rebound. The answer was rebound. Obvious now, right?

Confusing Draw Something image taken from http://baddrawsomething.tumblr.com/

This blog

For a long time I've been intending to document and share my various projects and hacks; this blog hopefully represents the end of my laziness in that regard. For a while I'll be retroactively documenting some recent projects, but soon this blog should become a place where I can share progress and work through problems as I find them.

About Me

Melbourne, Victoria, Australia
I'm hard at work on an MSc in Physics at the University of Melbourne; my research topic is 'building nano-scale neural networks'. In my (limited) spare time I tinker with 3D printing, electronics and programming.