Bit Manupulation

位运算是一种用位的方式操作数据的算法。低级设备控制、错误检测和修正算法、数据压缩、加密算法和优化都会用到位运算。对于其他的任务,主流语言会让程序员与抽象层交互,而不是直接操作位。那些使用位运算的操作主要使用一下四种位运算符:按位与,或,异或,非,以及移位。

位运算在某些时候可以取代或者减少对数据结构的循环,并且提供更好的性能,因为位运算都是并行的。但是,这样的代码更难编写或者维护。

细节

基础

位运算的核心在于&(与), |(或),~(非)和^(异或)以及移位运算符<<和>>。

异或操作是没有任何的布尔运算符可以解释的,但是有一种简单的解释:异或接受两个输入,如果两个输入不一样,或者说是只有一个是1的时候,返回1。异或运算的运算符是^,对每一对比特进行异或操作。异或也经常被缩写成XOR。

  • Set union集合的并集 A | B
  • Set intersection集合的交集 A & B
  • Set subtraction集合的差集 A & ~B
  • Set negation集合的补集 ALL_BITS ^ A or ~A
  • Set bit 位设定 A |= 1 << bit
  • Clear bit 位清除 A &= ~(1 << bit)
  • Test bit 位测试 (A & 1 << bit) != 0
  • Extract last bit 取最后一位 A&-A or A&~(A-1) or x^(x&(x-1))
  • Remove last bit 移去最后一位的1(位置不变) A&(A-1)
  • Get all 1-bits 获取所有位的1(int值是-1) ~0

例子

计算所给的数字中,二进制的1的个数。

int count_one(int n) {
    while(n) {
        n = n&(n-1); // 这里通过与n-1按位与,移去了最后一位。
        count++;
    }
    return count;
}

判断所给的数字是不是4的幂。

Is power of four (actually map-checking, iterative and recursive methods can do the same)

bool isPowerOfFour(int n) {
    return !(n&(n-1)) && (n&0x55555555);
    //check the 1-bit location;
}

^ tricks 异或的技巧

用^异或运算去去除偶数并留下奇数,或者保留不同的位并去除相同的位。

两个整数的和 Sum of Two Integers

用 ^ 来求两位数的和的carry,并让下一位移动。

int getSum(int a, int b) {
    return b==0? a:getSum(a^b, (a&b)<<1); //be careful about the terminating condition;
}

消失的数 Missing Number

给定一个含有不同数字的数组0,1,2……n,找到那个不在其中的数。比如 [0, 1, 3] 返回2。

当然,这个题可用数学法(高斯法)解。

int missingNumber(vector<int>& nums) {
    int ret = 0;
    for(int i = 0; i < nums.size(); ++i) {
        ret ^= i;
        ret ^= nums[i];
    }
    return ret^=nums.size();
}

| tricks 或的技巧

保留尽可能多的1。

找到一个2的幂,其不大于所给的数字N。

long largest_power(long N) {
    //changing all right side bits to 1.
    N = N | (N>>1);
    N = N | (N>>2);
    N = N | (N>>4);
    N = N | (N>>8);
    N = N | (N>>16);
    return (N+1)>>1;
}

Reverse Bits 反转二进制

将所给的32位无符号整数的二进制反转。

Solution

uint32_t reverseBits(uint32_t n) {
    unsigned int mask = 1<<31, res = 0;
    for(int i = 0; i < 32; ++i) {
        if(n & 1) res |= mask;
        mask >>= 1;
        n >>= 1;
    }
    return res;
}
uint32_t reverseBits(uint32_t n) {
	uint32_t mask = 1, ret = 0;
	for(int i = 0; i < 32; ++i){
		ret <<= 1;
		if(mask & n) ret |= 1;
		mask <<= 1;
	}
	return ret;
}

& tricks 与的技巧

就是选择某些位。

Reversing the bits in integer 反转整数中的某些位

x = ((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1);
x = ((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2);
x = ((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4);
x = ((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8);
x = ((x & 0xffff0000) >> 16) | ((x & 0x0000ffff) << 16);

Bitwise AND of Numbers Range 数字范围的异或

给定一个范围 [m, n] ,其中0 <= m <= n <= INT_MAX,返回所有这些数字范围内的异或值。比如说给定[5, 7],你应该返回4。

int rangeBitwiseAnd(int m, int n) {
    int a = 0;
    while(m != n) {
        m >>= 1;
        n >>= 1;
        a++;
    }
    return m<<a; 
}

Number of 1 Bits 汉明重量

写一个函数将一个整数的汉明重量返回。

一个整数的汉明重量是该数字的二进制中1的位的个数。

int hammingWeight(uint32_t n) {
	int count = 0;
	while(n) {
		n = n&(n-1);
		count++;
	}
	return count;
}
int hammingWeight(uint32_t n) {
    ulong mask = 1;
    int count = 0;
    for(int i = 0; i < 32; ++i){ //31 will not do, delicate;
        if(mask & n) count++;
        mask <<= 1;
    }
    return count;
}

Repeated DNA Sequences 重复的DNA序列

所有的DNA都是由一系列分别为A,C,G,T的碱基对构成。比如说,“ACGAATTCCG”。当在学习DNA的时候,发现其中重复的序列有时很有帮助。写一个函数来在一段DNA中找到所有的长度为10的重复DNA序列。

例子:

给定 s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

返回 : ["AAAAACCCCC", "CCCCCAAAAA"].

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        int sLen = s.length();
        vector<string> v;
        if(sLen < 11) return v;
        char keyMap[1<<21]{0};
        int hashKey = 0;
        for(int i = 0; i < 9; ++i) hashKey = (hashKey<<2) | (s[i]-'A'+1)%5;
        for(int i = 9; i < sLen; ++i) {
            if(keyMap[hashKey = ((hashKey<<2)|(s[i]-'A'+1)%5)&0xfffff]++ == 1)
                v.push_back(s.substr(i-9, 10));
        }
        return v;
    }
};

但上面的方法在序列出现太多次的时候会出问题。我们应该使用unordered_map去取代char keyMap[1<<21](0)

Majority Element 多数元素

Given an array of size n, find the majority element. The majority element is the element that appears more than ⌊ n/2 ⌋ times. (bit-counting as a usual way, but here we actually also can adopt sorting and Moore Voting Algorithm)

Solution

int majorityElement(vector<int>& nums) {
    int len = sizeof(int)*8, size = nums.size();
    int count = 0, mask = 1, ret = 0;
    for(int i = 0; i < len; ++i) {
        count = 0;
        for(int j = 0; j < size; ++j)
            if(mask & nums[j]) count++;
        if(count > size/2) ret |= mask;
        mask <<= 1;
    }
    return ret;
}

Single Number III

Given an array of integers, every element appears three times except for one. Find that single one. (Still this type can be solved by bit-counting easily.) But we are going to solve it by digital logic design

Solution

//inspired by logical circuit design and boolean algebra;
//counter - unit of 3;
//current   incoming  next
//a b            c    a b
//0 0            0    0 0
//0 1            0    0 1
//1 0            0    1 0
//0 0            1    0 1
//0 1            1    1 0
//1 0            1    0 0
//a = a&~b&~c + ~a&b&c;
//b = ~a&b&~c + ~a&~b&c;
//return a|b since the single number can appear once or twice;
int singleNumber(vector<int>& nums) {
    int t = 0, a = 0, b = 0;
    for(int i = 0; i < nums.size(); ++i) {
        t = (a&~b&~nums[i]) | (~a&b&nums[i]);
        b = (~a&b&~nums[i]) | (~a&~b&nums[i]);
        a = t;
    }
    return a | b;
}
;

Maximum Product of Word Lengths

Given a string array words, find the maximum value of length(word[i]) * length(word[j]) where the two words do not share common letters. You may assume that each word will contain only lower case letters. If no such two words exist, return 0.

Example 1: Given ["abcw", "baz", "foo", "bar", "xtfn", "abcdef"] Return 16 The two words can be "abcw", "xtfn".

Example 2: Given ["a", "ab", "abc", "d", "cd", "bcd", "abcd"] Return 4 The two words can be "ab", "cd".

Example 3: Given ["a", "aa", "aaa", "aaaa"] Return 0 No such pair of words.

Solution

Since we are going to use the length of the word very frequently and we are to compare the letters of two words checking whether they have some letters in common:

using an array of int to pre-store the length of each word reducing the frequently measuring process; since int has 4 bytes, a 32-bit type, and there are only 26 different letters, so we can just use one bit to indicate the existence of the letter in a word.

int maxProduct(vector<string>& words) {
    vector<int> mask(words.size());
    vector<int> lens(words.size());
    for(int i = 0; i < words.size(); ++i) lens[i] = words[i].length();
    int result = 0;
    for (int i=0; i<words.size(); ++i) {
        for (char c : words[i])
            mask[i] |= 1 << (c - 'a');
        for (int j=0; j<i; ++j)
            if (!(mask[i] & mask[j]))
                result = max(result, lens[i]*lens[j]);
    }
    return result;
}

Attention

result after shifting left(or right) too much is undefined right shifting operations on negative values are undefined right operand in shifting should be non-negative, otherwise the result is undefined The & and | operators have lower precedence than comparison operators Sets

All the subsets A big advantage of bit manipulation is that it is trivial to iterate over all the subsets of an N-element set: every N-bit value represents some subset. Even better, if A is a subset of B then the number representing A is less than that representing B, which is convenient for some dynamic programming solutions.

It is also possible to iterate over all the subsets of a particular subset (represented by a bit pattern), provided that you don’t mind visiting them in reverse order (if this is problematic, put them in a list as they’re generated, then walk the list backwards). The trick is similar to that for finding the lowest bit in a number. If we subtract 1 from a subset, then the lowest set element is cleared, and every lower element is set. However, we only want to set those lower elements that are in the superset. So the iteration step is just i = (i - 1) & superset.

vector<vector<int>> subsets(vector<int>& nums) {
    vector<vector<int>> vv;
    int size = nums.size(); 
    if(size == 0) return vv;
    int num = 1 << size;
    vv.resize(num);
    for(int i = 0; i < num; ++i) {
        for(int j = 0; j < size; ++j)
            if((1<<j) & i) vv[i].push_back(nums[j]);   
    }
    return vv;
}

Actually there are two more methods to handle this using recursion and iteration respectively.

Bitset

A bitset stores bits (elements with only two possible values: 0 or 1, true or false, ...). The class emulates an array of bool elements, but optimized for space allocation: generally, each element occupies only one bit (which, on most systems, is eight times less than the smallest elemental type: char).

// bitset::count
#include <iostream>       // std::cout
#include <string>         // std::string
#include <bitset>         // std::bitset

int main () {
  std::bitset<8> foo (std::string("10110011"));
  std::cout << foo << " has ";
  std::cout << foo.count() << " ones and ";
  std::cout << (foo.size()-foo.count()) << " zeros.\n";
  return 0;
}

Always welcom new ideas and practical tricks, just leave them in the comments!

来源: https://leetcode.com/problems/sum-of-two-integers/discuss/84278/A-summary%3A-how-to-use-bit-manipulation-to-solve-problems-easily-and-efficiently