Joining strings in Python

written by Domen Kožar, on Jun 11, 2009 2:58:00 PM.

There is already plenty of good material on the performance of string concatenation in Python. Based on that, the best approach is to use str.join:
    >>> print ' '.join(['The', 'fox', 'jumped', 'over', 'the', 'dog.'])
    The fox jumped over the dog.
Perfect. We have very fast string concatenation that even lets us choose the separator. Now, here is the problem:
    >>> print ', '.join(['apples', 'oranges', '', 'cocos'])
    apples, oranges, , cocos
So if we generate the strings dynamically and use a separator for a reason, we do not want an empty item to leave a dangling separator in the output. This is the basic idea:
    >>> print ', '.join(filter(None, ['apples', 'oranges', '', 'cocos']))
    apples, oranges, cocos
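A small sketch of what the filtering step does, written for Python 3 (where print is a function and filter returns an iterator that str.join consumes directly). The sample lists are made up for illustration:

```python
# Python 3 sketch; the sample data is illustrative only.
pieces = ['apples', '', None, 'oranges', 'cocos']

# filter(None, iterable) keeps only the truthy items,
# so both '' and None are dropped before joining.
joined = ', '.join(filter(None, pieces))
print(joined)  # apples, oranges, cocos

# Any non-empty string is truthy, so '0' is kept while '' is dropped.
zero_kept = ', '.join(filter(None, ['0', '', 'x']))
print(zero_kept)  # 0, x
```

Every falsy value is removed, not only empty strings, which is why None can sit in the list without breaking the join.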
As you can see, filter(None, iterable) returns a list of the elements that evaluate to True. This gives us reasonably safe string concatenation. A real-world example:
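    class StringBuffer(object):
        """Collect string pieces and join the non-empty ones on demand."""

        def __init__(self, sep=''):
            self.sep = sep
            self.output = []

        def write(self, content):
            # content must be an iterable of pieces, e.g. a list of strings
            self.output.extend(content)

        def getvalue(self):
            # filter(None, ...) drops '' and None before joining
            return self.sep.join(filter(None, self.output))


    >>> sb = StringBuffer(', ')
    >>> sb.write(['apples'])
    >>> sb.write(['', None])
    >>> sb.write(['oranges', 'cocos'])
    >>> print sb.getvalue()
    apples, oranges, cocos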
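For readers on Python 3, here is a hedged sketch of the same class. It is an adaptation, not the original code: it uses print as a function and adds the optional __str__ alias suggested in the comments below, so take the alias as one possible design rather than the author's choice.

```python
class StringBuffer:
    """Collect string pieces and join the non-empty ones on demand."""

    def __init__(self, sep=''):
        self.sep = sep
        self.output = []

    def write(self, content):
        # content is expected to be an iterable of pieces (e.g. a list);
        # a bare string would be added one character at a time.
        self.output.extend(content)

    def getvalue(self):
        # filter(None, ...) skips '' and None before joining.
        return self.sep.join(filter(None, self.output))

    # Optional alias so that print(sb) and str(sb) return the joined text.
    __str__ = getvalue


sb = StringBuffer(', ')
sb.write(['apples'])
sb.write(['', None])
sb.write(['oranges', 'cocos'])
print(sb)  # apples, oranges, cocos
```

Keeping getvalue() alongside the alias preserves the StringIO-style interface while still allowing the instance to be printed directly.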

Comments

  • I'd rename the getvalue() to __str__(), then you can just print the instance and __str__() will be called implicitly.

    Comment by Jay Soffian, Jun 11, 2009 5:52:30 PM

  • Also, in case you're not aware, there's already a StringIO (and cStringIO) class if you need/want a file-like interface.

    Comment by Jay Soffian, Jun 11, 2009 5:55:56 PM

  • This can also be accomplished with list comprehensions:

    ', '.join([x for x in ['dude', 'wow', '', 'how', 'pow', ''] if x])

    In my opinion, the class you wrote seems tedious and unnecessary. But yes, as Jay said, you can improve it by using special methods.

    Comment by alecwh, Jun 11, 2009 7:23:00 PM

  • Also, allow line breaks in your comments! ;)

    Comment by alecwh, Jun 11, 2009 7:24:29 PM

  • The __str__ idea is pretty neat, though I normally prefer to avoid magical behavior. StringIO has the same interface as StringBuffer, but is a bit slower (as shown in the links in the blog entry). Also, the filter function is written in C and is faster than list comprehensions. My goal is very fast string concatenation with filtering.

    About the comments: I'll see how powerful Zine is :)

    Comment by Domen Kožar, Jun 11, 2009 10:23:00 PM

  • Indeed, list comprehensions are slower. I would suggest using get_value() for that method name. By convention, words in method names are separated by underscores.

    Comment by alecwh, Jun 12, 2009 12:55:05 AM

  • You could use ifilter from itertools or a generator expression: ', '.join(x for x in [list] if x is not None). This uses generators, which will be (well, are) the default in Python 3. It results in better performance and lower memory usage.

    Comment by deno, Jun 27, 2009 5:04:06 PM
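The speed claims in this thread (filter versus list comprehensions versus generator expressions) can be checked with a rough timing sketch like the one below. It assumes Python 3 and uses made-up sample data; the function names are hypothetical, and no timings are quoted here because they depend on the interpreter version and the data.

```python
import timeit

# Hypothetical sample data: mostly non-empty strings with some gaps.
data = ['apples', '', 'oranges', None, 'cocos'] * 1000


def with_filter():
    return ', '.join(filter(None, data))


def with_listcomp():
    return ', '.join([x for x in data if x])


def with_genexpr():
    return ', '.join(x for x in data if x)


# All three approaches must produce the same string.
assert with_filter() == with_listcomp() == with_genexpr()

# Time each approach; lower is faster on this machine and this data.
for func in (with_filter, with_listcomp, with_genexpr):
    seconds = timeit.timeit(func, number=200)
    print('%-14s %.4f s' % (func.__name__, seconds))
```

Measuring on your own data is more reliable than a general rule, since the relative cost shifts with list size and with how many items are filtered out.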
