# The Not-So-Easy C

Whenever I get an embedded software engineer's resume, one thing I can guarantee is there will be definitely a line says:

Excellent in C/C++ (or something similar)

Another one I am sure is someone will challenge the C++ skills of the candidate: so let's talk about templates...

I understand why the graduates think they are qualified for C/C++ programming, and I fully agree C++ is way more than inheritance and OOP.

What I don't understand is why people believe C is simple and, relatively, easy.

I've heard a humble interviewee said:

I am not a C++ expert, but I know C very well.

Really?

Let's test ourselves with some quick questions:

• Based on What C standard did you write your last C program? You might argue that I am showing off by asking this pedantic question. I'm not. If you are not using C99 or later, you will not be able to write inline functions because that is not supported. You will have to put all variable declarations at the top of the start of a compound statement or the compiler will complain about the "declaration may not appear after executable statement in block" error; You cannot define a variable-length array, and you will have to define your own boolean data type - these are not provided in the pre-C99 standards. There are more. This is why you need to understand what standard you are using before touching the keyboard. With the knowledge, you will solve the compilation errors easily by adding an compile option, -std=c99.
• So I assume you use C99, which I believe is the most popular standard nowadays. What does the C keyword, _restrict_, mean? No it is not _register_, I mean _restrict_. If you have no idea or are not sure about your answer, check it out here. Again one would question the point of this question: it is too esoteric, even more esoteric than the word "esoteric"! Calm down. Think about the applications of C today. There are good chances that you write C code for some embedded systems, which are very sensitive to efficiency and the footprint of the executable image. Proper use of the _restrict _keyword could help the compiler to produce significantly faster executables. If you seldom consider ways to optimize your code, you might have been lucky (or unfortunate, depending on how you look at it) to work on something called embedded but doesn't care too much about memory and efficiency, or, you might not know C very well, as an embedded software engineer.
• How would you define an integer? A question, instead of an answer is expected. One should ask: "How many bits of the integer we are talking about?" before writing on the whiteboard. And a statement "U32 x;" is better than "int x;", while "uint32_t x;" is even better because that is C standard, defined in stdint.h. "int" is evil in the embedded system industry. The type system of C has been long disputed and it is the embedded software engineers' responsibility to make it right. It is a must, but not virtue, to write clear, safe and portable code in C.

So, do you still think C is easy and you know it very well?

It would be great if you could also share your piece of not-so-easy C in the comment. My gratitude.

# 外婆

Printers are old ugly monsters from the last century. Twitter is about 6 years old and might shop for school pretty soon.

How would you put the two together? In this post twitter is used to access the printer and to request information and services.

The idea is simple. Nowadays printers can easily connected to network. By implementing a twitter connector  in the printer, the machine will understand how to communicate with the twitter server and then becomes 'twitter enabled'. The diagram below depicts the idea.

The twitter connector takes care of talking to the twitter server and translate the Tweets/DMs into printer languages.

With the printer's "twittability" enabled, one can register a twitter id as a proxy and connect the id to the printer. Once connected, the proxy will receive and send Tweets/DMs on behalf of the printer. In this way the Twitter enabled printers offer user an easier yet more powerful option to access and control:

• No extra software installation is required. Any twitter client on any platform, or the twitter webpage, will work.
• Status report is accessible in tweets even when the printer is offline.
• Printers post and "read" tweets and direct messages, which can be further customized, acting like human.

Would it be more interesting if you can talk to your printers like this?

A demo has been built to show how to talk to a twitter enabled printer:

# 11年记

• 老婆送了一把剃须刀，我开始享受刮胡子的感觉。
• 真正读了一些书。Rework, Rules of Simplicity和Being Geek都很有必要再读几次。Introduction to Algorithm依然在读，这根铁杵还得细磨下去。
• 完成了自己的第一个专利申请。没有太多的物质收入，但能在人生游戏中玩到这么一个关卡，也算是很有意思。
• 工作到现在，还是没能专精一项技术。这和大部分时间在做救火员有关，同时也要归咎于自己的毅力不足，喜欢东看看西碰碰。明年的目标是加深对内存问题和管理的理解，还有学习基本的网络知识。
• 学着平衡工作、生活和爱好。在公司就全心投入好好干活，到了家就多陪陪老婆主动帮帮忙，两头打点好了再看看书做些自己喜欢的事情。平衡很重要。
• 实现了回家两次的愿望。亲人的老去已经是事实，与其在远方不胜唏嘘还不如多陪他们吃顿饭多看会电视。
• 胜利完成婚姻一五计划。个中酸甜苦辣自知。庆祝过后，抬头看见的是更长更远更有挑战的相伴相随。
• 开始尽可能每天早上起床后跑步或是一些简单的锻炼。计划是坚持十年，看看不惑的时候会有什么收成。
• 朋友波波和大米的公司渐入佳境，迅雷哥哥也离开网易创业去了，他们为自己的梦想迈开步子去做也做到了，钦佩他们的勇气，替他们高兴；胡子和志彦分别都成了家，祝福他们一直幸福下去；还有更多的朋友也都分别结婚生子跳槽创业移民毕业。过去的一年井喷出来的是成熟与希望。
• 一位父亲的挚友因胰腺癌辞世。他一生乐观，健康生活，爱运动，学不怠，是我永远的榜样。他的提早离席，给我留下的不仅仅是难过。
• 公司再度经历重组动荡，2008年熟悉的场景和对话再次浮现。然而这次感受更多的是，真正的强者总能逆流而上。恶劣的生态往往更能让适应者生存下来，笑到最后。所以，不要怨天尤人，应该做的是提高自己适应环境。
• 认识了新的世界观。认识到人的渺小，感受到意志的伟大。有点像坐着小船开进了室外桃源：哇，原来世界如此，原来生命如此。

# Keep Your Fingers Away from "ESC"

When editing in Vim the keys that I hate most are "ESC" (Esc, Shift and CapsLock). They interrupt my thoughts when I have to find the keys far, far away from the comfort area of my fingers, not to mention the mistypes like pressing tab when I mean CapsLock.

There are many ways to get rid of these keys, below is my recipe. No guarantee you can throw the "ESC" keys away, but I would say your fingers will stay away from "ESC" for 80% of your typing experience.

In your vim configuration file, .vimrc if you are in Linux, set the mappings as:

"map space to @ (shift-2) for easier use the record buffer map <space> @

"jumping among windows, with the arrow keys nnoremap <silent> <Down> <C-W>j nnoremap <silent> <Up> <C-W>k nnoremap <silent> <Left> <C-W>h nnoremap <silent> <Right> <C-W>l

" not to even think about the <ESC> key imap jj <ESC>

"delete a word in insert mode imap jjd <ESC>diwi

"capitalize the first character of the current word, in Insert mode imap jjc <ESC>b~A

" Swap ; and :, save the Shift key. nnoremap ; : nnoremap : ;

"page down map ,j <C-d> "page up map ,k <C-u>

# An Improved Version of the Counting Sort Implementation

The "Introduction to Algorithm" provides an pseudo-implementation of counting sort.

COUNTING-SORT(A, B, k)

1  for i  ← 0 to k

2      do C[i] ← 0

3  for j ← 1 to length[A]

4      do  C[A[j]] ← C[A[j]] + 1

5  ► C[i] now contains the number of elements equal to i

6  for i  ← 1 to k

7      do C[i] ← C[i] + C[i -1]

8  ► C[i] now contains the number of elements less than or equal to i

9  for j  ← length[A] downto 1

10    do B[C[A[j]]] ← A[j]

11          C[A[j]] ← C[A[j]] - 1

It works well, but still there is room for improvement. Below is another version that makes counting sort a bit more faster:

COUNTING-SORT-2(A, B, k)

1  for i  ← 0 to k

2      do C[i] ← 0

3  for j ← 1 to length[A]

4      do  C[A[j]] ← C[A[j]] + 1

5  ► C[i] now contains the number of elements equal to i

6  n = 0

7 for i  ← 0 to k

8      for j ← 1 to C[i]

9            do B[n] ← i

10                n ← n + 1

Start from line 6 is where the two implementations differentiate. when no optimization is taken into account.

In COUNTING-SORT line 6 to line 11 takes:

k +  len(A) * 2 assignments, and k + len(A) pluses

In COUNTING-SORT-2 line 6 to line 10 takes:

len(A) * 2 assignments, and len(A) pluses

The two implementations both cost

$$Omega(len(A))$$,

but in the latter the constant k is eliminated.

At the end is a short python test that gives a taste of difference in real performance on my 2.8GHz Win7 laptop:

import time
def time_elapsed(func):
def wrapper(*arg):
t1 = time.time()
res = func(*arg)
t2 = time.time()
print("time elapsed in %s: %0.3f ms" % (func.__name__, (t2 - t1) * 1000.0))
return res
return wrapper

@time_elapsed
def counting_sort(a, k):
c = [0] * (k + 1)
for j in a:
c[j] = c[j] + 1
for i in range(1, k + 1):
c[i] = c[i] + c[i - 1]
b = a
for j in a[::-1]:
b[c[j] - 1] = j
c[j] = c[j] - 1
return b

@time_elapsed
def counting_sort_2(a, k):
c = [0] * (k + 1)
for j in a:
c[j] = c[j] + 1
b = []
for i in range(k + 1):
for j in range(c[i]):
b.append(i)
return b


Sorting with counting_sort_2:

data = [3,7,2,1,6,4,5,0] * 1000 counting_sort_2(data, 8 )

time elapsed in counting_sort_2: 22.000 ms

While doing the same sort with counting_sort:

data = [3,7,2,1,6,4,5,0] * 1000 counting_sort(data, 8 )

time elapsed in counting_sort: 36.000 ms

# 程序员老二

“很不爽。”我看见了他的脸在说话。

“还是那个样子，老板会对你说出‘别设计了，写代码吧’这样的话。因为制定计划的时候就没有打算要高质量的代码，他们只要最快速度写出能用的东西。”

“然后就开始抓人修bug，1000， 950……巴不得每个小时更新剩下的bug数目。想方设法威逼利诱底下的人加班加点解决问题，然后画一条漂亮的弧线邀功领赏。”

“工程师前期花时间想办法设计好？那你的工作报告就会很难看。老板会接受你的解释，但是考评和奖金会告诉你他的真实想法。你如果想做个好的程序员，就以最快的速度写出能工作的代码，等着测试人员报出一堆低级错误，然后轻松的快速解决掉他们，这样一来无论工作量还是效率你都优秀得无可挑剔。管别人怎么说去，潜在的问题也不是问题，过个一两年说不定你已经不用去和这些代码打交道了：恭喜！升级！”

“你呢？”我问。

“我的表现一直平平，进度最慢的是我，解决bug最少的也是我。反正没人会去关心你的设计合不合理，也没人会认为你的bug少是好事。”

“然后我就已经打算离职了，要在一个月内把手头上的项目做完。想也不多想地每天乱码一气，最后写出来的凑合能用，虽然问题一大堆。”

“然后我就接二连三地收到进步奖金、上司的鼓励、上司的上司的表扬……”

“然后我就走了”

# Encryption and Decryption - Note of Fundamental Number Theory III

In previous notes we've seen how messages are encrypted/decrypted from the top and middle level. Eventually we come to the very detail about how the helper functions in setRSAKeys(), and processMessage() are implemented.

• getPrime() in getRSAKeys() returns a prime. It randomly picks an integer and calls isPrime() to test if it is a prime. isPrime() makes use of ModularExponentiation() to perform the pseudo prime test.
int ModularExponentiation(int a, int b, int n)
{
int c = 0;
int d = 1;

/*locate the most significant bit of b*/

{
c = c * 2;
d = (d * d) % n;
if (0 != (mask & b))
{
c = c + 1;
d = (d * a) % n;
}
}

return d;
}

bool isPrime(int n)
{
/*this function requires function ModularExponentiation(),
*which assumes the input n is an odd integer greater than 2
*/

if (0 == n % 2)
{
return false;
}

if ( 1 == ModularExponentiation(2, n-1, n))
{
//This is, possibly, a prime, or to be accurate, a pseudoprime
if ( 561 == n)
return false; // the only Carmichael number in [1, 1000]
else
return true;
}
else
{
//This is definitely a composite
return false;
}
}

void getPrime(int* prime)
{
int suspect = 0;

do
{
//pick a random number
suspect = rand() % 1000;
//test if it is a prime
//if yes, return
//go to the top
} while (! isPrime(suspect));

*prime = suspect;
}


The modular exponentiation is to compute $a^b$ mod $n$, where $a$ and $b$ are nonnegative integers and $n$ is a positive integer. Here it is used to test if an input number is a prime. The test is also called primality-testing. In this note the so-called repeated squaring method, described as below, is utilized to compute the modular exponentiation:

Let $(b_k, b_{k-1}, ..., b_0)$ representing the binary form of $b$. $b_0$ is the least significant bit. ModularExponentiation(int a, int b, int n) computes $a^c$ mod $n$ as $c$ is increased by doublings, from 0 to $b$.

The function returns false when the input n is a composite, and true, if n is a pseudo-prime, which is very likely to be a prime, with a few exceptions.

The exceptions are called Carmichael numbers. They are composites and are also pseudo-primes, but they are very rare. So rare they are that for integers less than 100,000,000 there are only 255 Carmichael numbers.

In this note we make lives simple and limit the participants less than 1000, in which there is only one Carmichael number, 561.

There are of course other ways to perform perfect prime test. Miller-Rabin randomized primality test, for example, overcomes the "fake prime" problem that we have in the method above. Again to keep this note short we simply use the Modular Exponentiation method.

With a method for primality test, in the code above a random number is picked and tested. But how many numbers should we pick until a prime is found? Is there a way to estimate the density of primes given the integer range? Yes, by the Prime number theorem:

\begin{equation*} lim_{x \to \infty} \frac{ \pi(n)}{n \ln n} = 1 \end{equation*}

the number of primes equal or less than $n$ is approximately $\frac{n}{\ln n}$

So before we can find a 1024-bit long prime, we will need to test about $\ln 2^{1024} \approx 709$ numbers, and if we skip checking the evens, there are only 354 candidates to test.

Back to our little simple case, we will get a prime, oh, a pseudo-prime, after trying about $\frac{\ln 2^{9}}{2} \approx 3$ times, not bad ha?

After getting the required primes, two functions, gcd() and Extended_Euclid(), are used in setRSAKeys() to get $e$ and $d$.

The functions gcd() produces greatest common divisor of two integers. $e$, relatively prime to $\phi(n)$, is generated by this function:

int gcd(int a, int b)
{
if (b == 0)
return a;
else
return gcd(b, a % b);
}


I was thinking a simple while loop would be more efficient than the recursion version, but comparisons in C doesn't show any benefit...

After playing with some numbers and I realized that: 1. when a = b, the recursion terminates. 2. if a>b or b>a, the number of recursions is still reasonably small.

(and as it proves later in the book:(Lamé's theorem)

for any integer $k \geq 1$, if $a > b \geq 1$ and $b > F_{k+1}$,

then the call Euclid(a, b) makes fewer than k recursive calls.),

where $F_{k}$ is the Fibonacci numbers.

Exended_Euclid() are based on the Euclid's algorithm and it's extended version:

Given the input of $a$ and $b$, returns $d$, $x$ and $y$, where integers $a$, $b$, $d$, $x$, and $y$ satisfy:d = gcd(a, b) = ax+by

This function is used here to calculate modular multiplicative inverses, $d$.

  typedef struct euclid_coefficients
{
int d;
int x;
int y;
} euclid_coefficients;
euclid_coefficients ec;

euclid_coefficients Extended_Euclid(int a, int b)
{
if (b == 0)
{
ec.d = a;
ec.x   = 1;
ec.y   = 0;
return ec;
}
euclid_coefficients tmp;
tmp = Extended_Euclid(b, a % b);
int old_x = tmp.x;
tmp.x = tmp.y;
tmp.y = old_x - a/b*tmp.y;
return tmp;
}

* The last step is to use the key to encrypt/decrypt the message:

long powerMod(long x, int n, int y)
{
if (1 > n)
{
return 1;
}
if (x > y)
{
x = x % y;
}

long tmp = 1;
while ( n > 2 )
{
n -= 2;
tmp = ((x * x) % y) * tmp;
tmp %= y;
};
if (2 == n)
{
tmp = (x % y) * tmp;
}
tmp = (x % y) * tmp;
tmp %= y;

return tmp;
}

void processMessage(long* msg, key key, int len)
{
int i;

for (i = 0; i < len; i++)
{
msg[i] = powerMod(msg[i], key._key, key.n);
}
}


After all, we've built a simple encryption/decryption tool. Time to have some fun...

~/EncryptionAndDecryption/> test
p = 61, q = 881
n = 53741, phi(n) = 52800
e = 7
d = 7543
the encrypted message:
[~] [▒] [▒] [▒] []
the decrypted message:
[h] [e] [l] [l] [o]

~/EncryptionAndDecryption/>


# Encryption and Decryption - Note of Fundamental Number Theory II

The first thing we need to do in the main function is to prepare two groups of keys.

The function setRSAKeys() helps to generate a public/private key pairs:

int setRSAKeys( key* publickey, key* privatekey)
{
//generate two 512bit primes p and q
int p, q;
srand((unsigned)time(0));
getPrime(&p);
getPrime(&q);
//printf("p = %d, q = %d\n", p, q);

//calculate n and phi(n)
int n, phi_n;
n = p * q;
phi_n = (p - 1) * (q - 1);
//printf("n = %d, phi(n) = %d\n", n, phi_n);

//calculate e, a small odd relatively prime to phi_n
int e = 1;
do
{
e += 2;
} while (1 != gcd(e, phi_n));
//printf("e = %d\n", e);

//calculate d, the multiplicative inverse of e, modulo phi_n
euclid_coefficients ec ;
int d;
ec = Extended_Euclid(e, phi_n);
d = ec.x;
if (0 > d)
{
d += phi_n;
}
//printf("d = %d\n", d);

//return (n, e) and (n, d)
publickey->n = n;
publickey->_key = e;
privatekey->n = n;
privatekey->_key = d;

return 0;
}


In the function above, two primes are picked to compute $n$, which in reality is a gigantic number because $p$ and $q$ are big primes, say 1024 bits each. (again in this note we simplified the case by using small integers)

Then $e$ and $d\ =\ \phi(n)$ are obtained to form the primes: $e$ is a small odd integer and falls into the public key set. $d$, another big number calculated based on $n$, is not very easy to guess (or reversed-engineering), and is therefore put as part of the private key set.

Euler's phi function:

$\phi(n)=n\ \displaystyle\prod_{p|n} \left(1- \frac{1}{p}\right)$

where $\phi(n)$ is the size of $\mathbb{Z}_{n}^*$, $p$ runs over all the primes dividing $n$ (including $n$ itself, if $n$ is prime, too)

$\mathbb{Z}_{n}^*$ is the multiplicative group modulo n, or defined as

$\mathbb{Z}_{n}^* = {[a]_n}$ in $\mathbb{Z}_{n} : gcd(a,n) = 1$

where $[a]_{n}$ = $a'$ mod $n$, $a'$ in ${Z}$

e.g. $\mathbb{Z}_{15}^*\ =\ \{1, 2, 4, 7, 8, 11, 13, 14\}$, $\phi( 15 ) = 15 ( 1 - \frac{1}{3} ) ( 1 - \frac{1}{5} ) = 8$

When we select $p$ and $q$, which are primes and $n\ =\ pq$, The phi function becomes $\phi(n)\ =\ (p-1)(q-1)$, much easier now =)

At the end of the above function we have set two groups of keys: the public key $\{e,\ n\}$ and the private key $\{d,\ n\}$

With the public and private keys, encrypting and decrypting a message becomes easy. Given a message $M$, to encrypt $M$ we do:

$M^e\ mod\ n$

the outcome of which, $S$, is the encrypted message. To decrypt the secret,

$S^d\ mod\ n$

will give you the original message.

$d = e^{-1}\ mod\ \phi(n) = e^{-1}\ mod\ ((p-1)(q-1))$ guarantees for the given $p$, $q$ and $e$ there will be a unique $d$. Therefore for a given public key $\{ e,\ n \}$, there will be one and only one private key $\{ d,\ n \}$. This is exactly what we want!

On the other hand,

$(M^e\ mod\ n)^d\ mod\ n\ =\ M^{ed}\ mod\ n\ =\ M$

promises the encryption and decryption are irreversible operations.

To be continued...