2010年5月22日星期六

二分查找算法

JavaEye 在几天前发表了一篇关于二分查找算法的文章: 你是那10%可以实现二分查找算法的程序员吗?,讲得好像让所有的程序员都很尴尬的样子。最后文章还给出了一位 JavaScript 高手对二分查找算法的实现。但是这个算法其实也是有 bug 的。。。我们可以查看 Python 里的 bisect 模块,对于二分查找算法,分为 bisect_right 和 bisect_left 两种情况,这是由于所查找的元素在预排序算法中可能有多个存在或者元素可能会是一种复杂的数据结构,所以当然就必须考虑是查找左边的位置还是右边的位置,或者插入左边的位置还是右边的位置了。对于简单的元素不需要考虑插入左边的位置还是右边的位置,不过 Python 文档中举的例子中的元素是个复杂的数据结构,所以需要考虑这种需求。

附上那位写了《JavaScript高级程序设计》的作者使用 JavaScript 实现的代码:

//Copyright 2009 Nicholas C. Zakas. All rights reserved.
//MIT-Licensed, see source file
function binarySearch(items, value){

var startIndex = 0,
stopIndex = items.length - 1,
middle = Math.floor((stopIndex + startIndex)/2);

while(items[middle] != value && startIndex < stopIndex){

//adjust search area(调整查找范围)
if (value < items[middle]){
stopIndex = middle - 1;
} else if (value > items[middle]){
startIndex = middle + 1;
}

//recalculate middle(重新计算中项索引)
middle = Math.floor((stopIndex + startIndex)/2);
}

//make sure it's the right value(确保返回正确的值)
return (items[middle] != value) ? -1 : middle;
}

可以在 Firebug 或者其他浏览器的 JavaScript 调试器中测试这段代码, 也可以把这段代码翻译成 Python 代码,再添加一个小小的测试:

# coding=utf8
def binarySearch(items, value):

startIndex = 0
stopIndex = len(items)
middle = (stopIndex + startIndex) // 2

while items[middle] != value and startIndex < stopIndex:

#adjust search area(调整查找范围)
if value < items[middle]:
stopIndex = middle - 1
elif value > items[middle]:
startIndex = middle + 1

#recalculate middle(重新计算中项索引)
middle = (stopIndex + startIndex) // 2

#make sure it's the right value(确保返回正确的值)
return (items[middle] != value) and None or middle

if __name__ == '__main__':
items = [1, 2, 3, 4, 4, 4, 4, 4, 5, 6]
print binarySearch(items, 4)

检测运行一下,当然就发现问题了。

最后附上 Python 标准库中的实现代码:

## from file: /path/to/python-install-prefix/lib/pythonx.x/bisect.py
def bisect_right(a, x, lo=0, hi=None):
"""Return the index where to insert item x in list a, assuming a is sorted.

The return value i is such that all e in a[:i] have e <= x, and all e in
a[i:] have e > x. So if x already appears in the list, a.insert(x) will
insert just after the rightmost x already there.

Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.
"""

if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if x < a[mid]: hi = mid
else: lo = mid+1
return lo

bisect = bisect_right # backward compatibility

def bisect_left(a, x, lo=0, hi=None):
"""Return the index where to insert item x in list a, assuming a is sorted.

The return value i is such that all e in a[:i] have e < x, and all e in
a[i:] have e >= x. So if x already appears in the list, a.insert(x) will
insert just before the leftmost x already there.

Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.
"""

if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] < x: lo = mid+1
else: hi = mid
return lo

嗯,学习算法知识的话,到 Python 的标准库中查看源代码也是一种非常好的学习方式啊!

--
http://magicoding.appspot.com/entry/0
2010,04,26

没有评论:

发表评论