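##Warm-up

This post grew out of a recent discussion with a friend about some of Git's features, during which I found that he was not very familiar with how Git works when several people collaborate.

If you are new to Git, I suggest playing the first two or three levels of Git-Learn-Branching first, to get a feel for rebase and pull.

If you are the only developer on a project, git pull will not cause any problems.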

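##Background

We have two developers, sunus and vivian, who want to pair-program on a project. The project was started by god and already has three commits. They will push their new code to the dev branch, and god will later merge it into the stable branch, master.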

sunus@mbp~[/private/var/tmp/git-pull/awesome-project] (master ✔)
[22:36]:cat git.c
#include <stdio.h>

int main()
{
        printf("Hello Git!");
        return 0;
}
sunus@mbp~[/private/var/tmp/git-pull/awesome-project] (master ✔)
[22:36]:git log
commit 163a6d700226b780b7852a79fe1370a6d38c819a
Author: god <god@mbp>
Date:   Mon Dec 9 22:13:15 2013 +0800

    remove FILE

commit d22bf163d093afb494ad619d8964572e55c73167
Author: god <god@mbp>
Date:   Mon Dec 9 22:11:40 2013 +0800

    write first lines of codes

commit 45f2016a51ce7b8317e074a961647c091a50cd94
Author: sunus <god@mbp>
Date:   Mon Dec 9 22:04:41 2013 +0800

    add first file

sunus@mbp~[/private/var/tmp/git-pull/awesome-project] (master ✔)
[22:48]:git branch
dev
* master

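PS: the end of the prompt shows the branch we are currently on; here it is master.

PPS: the symbol after the branch name shows whether the branch has uncommitted changes: ✔ means it does not, ⚡ means it does.

Now sunus and vivian each clone the project locally.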

[22:52]:echo "I am sunus:)"
I am sunus:)
sunus@mbp~[/private/var/tmp/git-pull]
[22:52]:git clone awesome-project sunus
Cloning into 'sunus'...
done.
Checking connectivity... done
sunus@mbp~[/private/var/tmp/git-pull]
[22:52]:cd sunus
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)
[22:52]:ls
git.c
sunus@mbp~[/private/var/tmp/git-pull]
[22:54]:echo "I am vivian ^^"
I am vivian ^^
sunus@mbp~[/private/var/tmp/git-pull]
[22:55]:git clone awesome-project vivian
Cloning into 'vivian'...
done.
Checking connectivity... done
sunus@mbp~[/private/var/tmp/git-pull]
[22:55]:cd vivian
sunus@mbp~[/private/var/tmp/git-pull/vivian] (master ✔)
[22:55]:ls
git.c

OK, let the pair programming begin :)

####sunus writes some code and makes 2 commits on his local branch.

sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ⚡)
[23:07]:git diff
diff --git a/git.c b/git.c
index 7d26397..127e99a 100644
--- a/git.c
+++ b/git.c
@@ -2,6 +2,8 @@

 int main()
 {
+        void *p;
         printf("Hello Git!");
+        printf("I am sunus and I am here with vivian");
         return 0;
 }

sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ⚡)
[23:07]:git add git.c
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ⚡)
[23:09]:git commit -m 'I add a intro'
[master 1c0b75b] I add a intro
 1 file changed, 1 insertion(+)
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)
[23:09]:
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ⚡)
[23:13]:git diff
diff --git a/git.c b/git.c
index 668a8f3..50aae80 100644
--- a/git.c
+++ b/git.c
@@ -1,8 +1,16 @@
 #include <stdio.h>

+void *magic()
+{
+        return (void *)magic;
+}
+
 int main()
 {
+        void *p;
         printf("Hello Git!");
-        printf("I am sunus and I am here with vivian");
+        printf("I am sunus and I am here with vivian\n");
+        p = magic();
+        printf("I will show you a magic: %p", p);
         return 0;
 }
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ⚡)
[23:13]:git add git.c
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ⚡)
[23:13]:git commit -m 'show you a magic'
[master 6df90b8] show you a magic
 1 file changed, 9 insertions(+), 1 deletion(-)
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)

Here is sunus's log:

sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)
[23:19]:git log
commit 6df90b8dc07988fb9590100338af6897d119ca1b
Author: sunus <sunuslee@gmail.com>
Date:   Mon Dec 9 23:14:07 2013 +0800

    show you a magic

commit 1c0b75b60d68ccc58eca5519f7fd15912277be84
Author: sunus <sunuslee@gmail.com>
Date:   Mon Dec 9 23:09:41 2013 +0800

    I add a intro

commit 163a6d700226b780b7852a79fe1370a6d38c819a
Author: god <god@mbp>
Date:   Mon Dec 9 22:13:15 2013 +0800

    remove FILE

commit d22bf163d093afb494ad619d8964572e55c73167
Author: god <god@mbp>
Date:   Mon Dec 9 22:11:40 2013 +0800

    write first lines of codes

commit 45f2016a51ce7b8317e074a961647c091a50cd94
Author: god <god@mbp>
Date:   Mon Dec 9 22:04:41 2013 +0800

    add first file

####vivian also writes some code and makes 1 commit on her local branch

sunus@mbp~[/private/var/tmp/git-pull/vivian] (master ⚡)
[23:25]:git diff
diff --git a/git.c b/git.c
index 7d26397..003e1ee 100644
--- a/git.c
+++ b/git.c
@@ -3,5 +3,6 @@
 int main()
 {
         printf("Hello Git!");
+        printf("I am vivian, I am new to Programming in C:<");
         return 0;
 }
sunus@mbp~[/private/var/tmp/git-pull/vivian] (master ⚡)
[23:25]:git add git.c
sunus@mbp~[/private/var/tmp/git-pull/vivian] (master ⚡)
[23:25]:git commit -m 'vivian committttt^^'
[master 1838ec2] vivian committttt^^
 1 file changed, 1 insertion(+)
sunus@mbp~[/private/var/tmp/git-pull/vivian] (master ✔)
[23:25]:git log
commit 1838ec2b16be49b5aa084eb463e8d03e3b1f47de
Author: vivian <vivian@gmail.com>
Date:   Mon Dec 9 23:25:34 2013 +0800

    vivian committttt^^

commit 163a6d700226b780b7852a79fe1370a6d38c819a
Author: god <god@mbp>
Date:   Mon Dec 9 22:13:15 2013 +0800

    remove FILE

commit d22bf163d093afb494ad619d8964572e55c73167
Author: god <god@mbp>
Date:   Mon Dec 9 22:11:40 2013 +0800

    write first lines of codes

commit 45f2016a51ce7b8317e074a961647c091a50cd94
Author: god <god@mbp>
Date:   Mon Dec 9 22:04:41 2013 +0800

    add first file
sunus@mbp~[/private/var/tmp/git-pull/vivian] (master ✔)
[23:25]:

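####What is the situation now?

sunus and vivian have each written their own code locally, on top of the same remote branch on origin, but neither knows what the other has done. They now need to combine their changes and push the result to the remote dev branch.

vivian is quicker and pushes without a second thought.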

sunus@mbp~[/private/var/tmp/git-pull/vivian] (master ✔)
[23:34]:git push -u origin master:dev
Counting objects: 5, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 343 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To /private/var/tmp/git-pull/awesome-project
   163a6d7..1838ec2  master -> dev
Branch master set up to track remote branch dev from origin.

This looks like a success, and god can see vivian's changes too :)

sunus@mbp~[/private/var/tmp/git-pull/awesome-project] (dev ✔)
[23:35]:cat git.c
#include <stdio.h>

int main()
{
        printf("Hello Git!");
        printf("I am vivian, I am new to Programming in C:<");
        return 0;
}
sunus@mbp~[/private/var/tmp/git-pull/awesome-project] (dev ✔)
[23:35]:git log
commit 1838ec2b16be49b5aa084eb463e8d03e3b1f47de
Author: vivian <vivian@gmail.com>
Date:   Mon Dec 9 23:25:34 2013 +0800

    vivian committttt^^

commit 163a6d700226b780b7852a79fe1370a6d38c819a
Author: god <god@mbp>
Date:   Mon Dec 9 22:13:15 2013 +0800

    remove FILE

Next, let's see what the sunus of old would do (he is in for some trouble).

##PULL

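git pull is an operation Git beginners use all the time. They usually think of it as simply bringing the local repository up to date with the remote one.

What they do not know is what actually happens behind it: by default, git pull runs git fetch and then merges the fetched branch into the current one. That is why pull works in most cases with one or a few developers, yet causes problems in real collaboration among many people.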

Here is a simple workflow:

First, sunus does not know whether origin has changed, so he pushes directly too.

[23:42]:git push -u origin master:dev
To /private/var/tmp/git-pull/awesome-project
 ! [rejected]        master -> dev (fetch first)
error: failed to push some refs to '/private/var/tmp/git-pull/awesome-project'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first merge the remote changes (e.g.,
hint: 'git pull') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Clearly the push failed, because vivian got there first and changed the remote repository. So sunus follows the hint and runs git pull first.

sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)
[23:47]:git pull origin dev
From /private/var/tmp/git-pull/awesome-project
 * branch            dev        -> FETCH_HEAD
Auto-merging git.c
CONFLICT (content): Merge conflict in git.c
Automatic merge failed; fix conflicts and then commit the result.
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ⚡)
[23:48]:cat git.c
#include <stdio.h>

void *magic()
{
        return (void *)magic;
}

int main()
{
        void *p;
        printf("Hello Git!");
<<<<<<< HEAD
        printf("I am sunus and I am here with vivian\n");
        p = magic();
        printf("I will show you a magic: %p", p);
=======
        printf("I am vivian, I am new to Programming in C:<");
>>>>>>> 1838ec2b16be49b5aa084eb463e8d03e3b1f47de
        return 0;
}

This next part is fairly common: sunus and vivian both changed the same code, so there is a conflict, and sunus has to resolve it by hand.

sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)
[23:52]:cat git.c
#include <stdio.h>

void *magic()
{
        return (void *)magic;
}

int main()
{
        void *p;
        printf("Hello Git!");
        printf("I am sunus and I am here with vivian\n");
        p = magic();
        printf("I will show you a magic: %p", p);
        printf("I am vivian, I am new to Programming in C:<");
        return 0;
}
sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)
[23:52]:git log
commit 135990d6a92554009966c7b88133501adba767f2
Merge: 6df90b8 1838ec2
Author: sunus <sunuslee@gmail.com>
Date:   Mon Dec 9 23:51:59 2013 +0800

    pull and resolved a conflict

commit 1838ec2b16be49b5aa084eb463e8d03e3b1f47de
Author: vivian <vivian@gmail.com>
Date:   Mon Dec 9 23:25:34 2013 +0800

    vivian committttt^^

OK, at this point sunus considers his work done and can push the code to the remote origin. (What will happen?)

First, let's compare the state of sunus's and vivian's local repositories:

vivian

vivian-after-push.png

sunus

sunus-before-push.png

sunus pushes.

sunus@mbp~[/private/var/tmp/git-pull/sunus] (master ✔)
[0:12]:git push -u origin master:dev
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (9/9), 903 bytes | 0 bytes/s, done.
Total 9 (delta 2), reused 0 (delta 0)
To /private/var/tmp/git-pull/awesome-project
   1838ec2..135990d  master -> dev
Branch master set up to track remote branch dev from origin.

The push succeeded. Now let's look at the local branches of sunus, vivian, and god:

sunus

sunus-after-push.png

vivian

vivian-after-sunus-push.png

god

god-after-sunus-push.png

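It looks fine. It is just a loop in the history, right?

But try git log -p and you will find that it shows no diff at all for the merge commit that sunus pushed. Surprising, isn't it?! (By default, git log -p prints no patch for merge commits.)

In other words, nobody except sunus can see from the log how sunus's and vivian's code was finally combined,

unless they diff the individual commits one by one.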

##Fetch + Rebase

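Now let's look at another approach, the one I recommend: fetch, then rebase.

fetch: downloads the changes from origin into the local repository without merging them.

rebase: replays the current branch on top of another branch, so that the history reads as a straight line rather than a loop (as with pull/merge).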
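
To make the difference between the two history shapes concrete, here is a toy model in C. It is only a sketch and has nothing to do with Git's real object format: `struct Commit`, `commit`, `merge`, `rebase` and `is_linear` are hypothetical names, and commits are just array slots with parent indices. A merge creates one new commit with two parents; a rebase creates new copies of the local commits, each with a single parent.

```c
#include <stddef.h>

#define MAX_COMMITS 16

/* Toy commit: a message plus up to two parent indices (-1 = none). */
struct Commit {
	const char *msg;
	int parent;   /* first parent */
	int parent2;  /* second parent, set only for merge commits */
};

static struct Commit store[MAX_COMMITS];
static int ncommits;

/* Append a commit to the store and return its index. */
int commit(const char *msg, int parent, int parent2)
{
	store[ncommits].msg = msg;
	store[ncommits].parent = parent;
	store[ncommits].parent2 = parent2;
	return ncommits++;
}

/* Merge: a single new commit that has both tips as parents. */
int merge(int ours, int theirs)
{
	return commit("merge", ours, theirs);
}

/* Rebase: copy the commits after `base` up to `tip`, oldest first,
 * on top of `onto`. The originals are left untouched. */
int rebase(int tip, int base, int onto)
{
	int chain[MAX_COMMITS];
	int n = 0;

	for (int c = tip; c != base && c != -1; c = store[c].parent)
		chain[n++] = c;
	while (n--)
		onto = commit(store[chain[n]].msg, onto, -1);
	return onto;
}

/* 1 if no commit reachable through first parents is a merge. */
int is_linear(int tip)
{
	for (int c = tip; c != -1; c = store[c].parent)
		if (store[c].parent2 != -1)
			return 0;
	return 1;
}
```

In this model the rebased tip is a different commit from the original tip, which mirrors how the commit hashes change after a real rebase.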

Let's go back to sunus's earlier situation: vivian has already pushed her code.

This time, sunus uses fetch.

sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] (master ⚡)
[11:15]:git fetch origin
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ../awesome-project
 * [new branch]      dev        -> origin/dev
 * [new branch]      master     -> origin/master

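After downloading the new changes, we have these new branches:

  1. origin/dev: contains vivian's new changes.
  2. origin/master: the master branch of the remote origin, which we can ignore here.

Next, we want to put our changes at the very top of origin/dev, right after vivian's change, so that the history looks as if one person had written all the code.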

sunus@mbp~[/private/var/tmp/git-fetch-rebas/sunus] (master ⚡)
[11:16]:git rebase origin/dev
First, rewinding head to replay your work on top of it...
Applying: I add a intro
Using index info to reconstruct a base tree...
M	git.c
Falling back to patching base and 3-way merge...
Auto-merging git.c
CONFLICT (content): Merge conflict in git.c
Failed to merge in the changes.
Patch failed at 0001 I add a intro
The copy of the patch that failed is found in:
   /private/var/tmp/git-fetch-rebase/sunus/.git/rebase-apply/patch

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] ((no ⚡)
[11:16]:git mergetool
Merging:
git.c

Normal merge conflict for 'git.c':
  {local}: modified file
  {remote}: modified file
Hit return to start merge resolution tool (vimdiff):
4 files to edit

OK, we hit a merge conflict here, resolve it with mergetool, and then continue the rebase.

sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] ((no ⚡)
[11:18]:git rebase --continue
Applying: I add a intro
Applying: show you a magic
Using index info to reconstruct a base tree...
M	git.c
Falling back to patching base and 3-way merge...
Auto-merging git.c
CONFLICT (content): Merge conflict in git.c
Failed to merge in the changes.
Patch failed at 0002 show you a magic
The copy of the patch that failed is found in:
   /private/var/tmp/git-fetch-rebase/sunus/.git/rebase-apply/patch

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] ((no ⚡)
[11:18]:git mergetool
Merging:
git.c

Normal merge conflict for 'git.c':
  {local}: modified file
  {remote}: modified file
Hit return to start merge resolution tool (vimdiff):
4 files to edit
sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] ((no ⚡)
[11:19]:cat git.c
#include <stdio.h>

void *magic()
{
        return (void *)magic;
}

int main()
{
        void *p;
        printf("Hello Git!");
        printf("I am sunus and I am here with vivian\n");
        p = magic();
        printf("I will show you a magic: %p", p);
        printf("I am vivian, I am new to Programming in C:<");
        return 0;
}
sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] ((no ⚡)
[11:19]:git rebase --continue
Applying: show you a magic

OK, the rebase is complete. sunus's 2 commits, 5580978 and 8318c49, now sit at the top of the log.

sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] (master ✔)
[11:19]:git log
commit 5580978c60d157da68816644aba7afecd328a4be
Author: sunus <sunuslee@gmail.com>
Date:   Mon Dec 9 23:14:07 2013 +0800

    show you a magic

commit 8318c499e6d5c4d9fd9ba46c19994c326a6cb1c5
Author: sunus <sunuslee@gmail.com>
Date:   Mon Dec 9 23:09:41 2013 +0800

    I add a intro

commit 1838ec2b16be49b5aa084eb463e8d03e3b1f47de
Author: vivian <vivian@gmail.com>
Date:   Mon Dec 9 23:25:34 2013 +0800

    vivian committttt^^

commit 163a6d700226b780b7852a79fe1370a6d38c819a
Author: god <god@mbp>
Date:   Mon Dec 9 22:13:15 2013 +0800

    remove FILE

Next, we push our local changes to the remote repository.

sunus@mbp~[/private/var/tmp/git-fetch-rebase/sunus] (master ✔)
[11:32]:git push -u origin master:dev
Counting objects: 8, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), 663 bytes | 0 bytes/s, done.
Total 6 (delta 1), reused 0 (delta 0)
To ../awesome-project
   1838ec2..5580978  master -> dev
Branch master set up to track remote branch dev from origin.

Let's look at the current history for sunus and god.

sunus

sunus-push-after-rebase.png

god

god-after-sunus-rebase-push.png

OK, that looks good!

Now let's see what vivian has to do to get the latest code.

sunus@mbp~[/private/var/tmp/git-fetch-rebase/vivian] (master ✔)
[11:56]:git fetch
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From ../awesome-project
 * [new branch]      dev        -> origin/dev
 * [new branch]      master     -> origin/master
sunus@mbp~[/private/var/tmp/git-fetch-rebase/vivian] (master ✔)
[11:57]:git diff master origin/dev
diff --git a/git.c b/git.c
index 003e1ee..0187070 100644
--- a/git.c
+++ b/git.c
@@ -1,8 +1,17 @@
 #include <stdio.h>

+void *magic()
+{
+        return (void *)magic;
+}
+
 int main()
 {
+        void *p;
         printf("Hello Git!");
+        printf("I am sunus and I am here with vivian\n");
+        p = magic();
+        printf("I will show you a magic: %p", p);
         printf("I am vivian, I am new to Programming in C:<");
         return 0;
 }
sunus@mbp~[/private/var/tmp/git-fetch-rebase/vivian] (master ✔)
[11:57]:git log
commit 1838ec2b16be49b5aa084eb463e8d03e3b1f47de
Author: vivian <vivian@gmail.com>
Date:   Mon Dec 9 23:25:34 2013 +0800

    vivian committttt^^

commit 163a6d700226b780b7852a79fe1370a6d38c819a
Author: god <god@mbp>
Date:   Mon Dec 9 22:13:15 2013 +0800

    remove FILE

sunus@mbp~[/private/var/tmp/git-fetch-rebase/vivian] (master ✔)
[11:57]:git merge origin/dev
Updating 1838ec2..5580978
Fast-forward
 git.c | 9 +++++++++
 1 file changed, 9 insertions(+)

sunus@mbp~[/private/var/tmp/git-fetch-rebase-bak/vivian] (master ✔)
[11:58]:git log
commit 5580978c60d157da68816644aba7afecd328a4be
Author: sunus <sunuslee@gmail.com>
Date:   Mon Dec 9 23:14:07 2013 +0800

    show you a magic

commit 8318c499e6d5c4d9fd9ba46c19994c326a6cb1c5
Author: sunus <sunuslee@gmail.com>
Date:   Mon Dec 9 23:09:41 2013 +0800

    I add a intro

commit 1838ec2b16be49b5aa084eb463e8d03e3b1f47de
Author: vivian <vivian@gmail.com>
Date:   Mon Dec 9 23:25:34 2013 +0800

    vivian committttt^^

commit 163a6d700226b780b7852a79fe1370a6d38c819a
Author: god <god@mbp>
Date:   Mon Dec 9 22:13:15 2013 +0800

    remove FILE

Good, no problems on vivian's side either; her local repository is now in sync.

Finally, here is her local history:

vivian-after-sunus-rebase

That looks great~

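##Summary

  • If you are working alone on a project hosted by a third party (GitHub, for example), neither pull nor fetch + rebase will cause much trouble, and pull is the more convenient of the two.
    • GitHub's pull requests also deal with this by putting other people's changes on top of the current history.
  • With several collaborators, pull will work most of the time, but it leaves the log confusing after merges and obscures how development actually proceeded (because the revision history contains loops).

  • So it is better to fetch often and rebase often. That keeps the history linear, and every change shows up clearly in the log.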

##Update:

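My friend @hunt read this post and offered a comment on merge versus rebase:

Developers and maintainers nearer the top of the kernel tree use merge. For example, Linus merges the tree of the networking maintainer David Miller (Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net), and David Miller merges the tree of the OpenvSwitch maintainer Jesse Gross (Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch). Further downstream, for example in the OpenvSwitch community, developers submitting code for the kernel module are required to rebase so that their commits form a linear series. This makes for a very good division of labor: Jesse Gross maintains the OpenvSwitch module code; David Miller can merge it easily and concentrate on changes to the core of the net subsystem; and Linus can just as easily merge the net tree and only needs to watch changes to the base code on the mainline. For developers, it is also easy to tell which code should go to which mailing list, and which related lists to CC when a change touches them.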

##Lab2

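This lab is mainly about writing the kernel's memory-management code:

  1. Initialize physical memory, and implement alloc and free for physical pages.
  2. Build the two-level structure of page directory and page tables, and set permissions.
  3. Write the basic page operations: create / look up / insert / remove.
  4. Initialize the memory of each region as required, assigning kernel / user permissions.

Next, we go step by step through the points to watch out for when completing this lab.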

###Part 1

Physical Page Management

The operating system must keep track of which parts of physical RAM are free and which are currently in use. JOS manages the PC’s physical memory with page granularity so that it can use the MMU to map and protect each piece of allocated memory.

You’ll now write the physical page allocator. It keeps track of which pages are free with a linked list of struct PageInfo objects, each corresponding to a physical page. You need to write the physical page allocator before you can write the rest of the virtual memory implementation, because your page table management code will need to allocate physical memory in which to store page tables.

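Clearly, we need to write a physical memory allocator. Since this is physical memory, the allocation logic is very simple.

Maintain a singly linked list, page_free_list.

  • To allocate a page, remove the head of the list and use it as the new page.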
struct PageInfo *
page_alloc(int alloc_flags)
{
	// Fill this function in
	// SUNUS, 23, October, 2013
	struct PageInfo *pp = page_free_list;
	if (!pp)
		return NULL;
	page_free_list = page_free_list->pp_link;
	if (alloc_flags & ALLOC_ZERO)
		memset(page2kva(pp), '\0', PGSIZE);
	pp->pp_link = NULL;
	return pp;
}
  • To free a page, add it back to page_free_list.
    • page_free is only called once the page's reference count has dropped to 0.
void
page_free(struct PageInfo *pp)
{
	// Fill this function in
	// SUNUS, 23, October, 2013
	assert(pp->pp_ref == 0);
	pp->pp_link = page_free_list;
	page_free_list = pp;
}
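
To watch the free-list discipline in isolation, here is a minimal user-space sketch. It is not the JOS code: `struct Page`, `pool_init`, `pool_alloc` and `pool_free` are hypothetical names, and a small static array stands in for physical memory, so there is no `page2kva` or `ALLOC_ZERO` handling.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical stand-in for JOS's struct PageInfo. */
struct Page {
	struct Page *link;  /* next free page; NULL once allocated */
	unsigned ref;       /* reference count */
};

static struct Page pages[NPAGES];
static struct Page *free_list;

/* Push every page onto the free list. */
void pool_init(void)
{
	free_list = NULL;
	for (size_t i = 0; i < NPAGES; i++) {
		pages[i].ref = 0;
		pages[i].link = free_list;
		free_list = &pages[i];
	}
}

/* Pop the head of the free list; returns NULL when no page is left. */
struct Page *pool_alloc(void)
{
	struct Page *pp = free_list;

	if (!pp)
		return NULL;
	free_list = pp->link;
	pp->link = NULL;
	return pp;
}

/* Push a page back onto the head of the free list. */
void pool_free(struct Page *pp)
{
	assert(pp->ref == 0);  /* only unreferenced pages may be freed */
	pp->link = free_list;
	free_list = pp;
}
```

Because both operations work on the list head, a page that was just freed is the next one handed out, and both run in constant time.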

###Part 2

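This part is the focus of the lab. First, you need to understand linear addresses, virtual addresses, and physical addresses. For now, once paging is enabled, a linear address and a virtual address can be treated as the same thing.

Under Linux, every process has its own address space, which is 4GB on a 32-bit system. Every address is therefore four bytes long, exactly the size of a pointer. With paging understood, a virtual address turns out to consist of the following three parts: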

// A linear address 'la' has a three-part structure as follows:
//
// +--------10------+-------10-------+---------12----------+
// | Page Directory |   Page Table   | Offset within Page  |
// |      Index     |      Index     |                     |
// +----------------+----------------+---------------------+
//  \--- PDX(la) --/ \--- PTX(la) --/ \---- PGOFF(la) ----/
//  \---------- PGNUM(la) ----------/

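A page directory is really an array of 1024 32-bit integers, each of which points to a page table: it holds the page table's physical address together with flag bits. Each page table is likewise an array of 1024 32-bit integers, whose elements hold the physical address of a page, again with flag bits.

The top 10 bits of a virtual address are its page directory index, used to fetch from the page directory the address of the page table that covers the address. The next 10 bits (bits 12 to 21) index into that page table and yield the physical base address of the page. Finally, adding the low 12 bits of the virtual address to that base address completes the translation from virtual to physical address.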
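
The translation just described can be sketched in user space as follows. The `PDX`, `PTX` and `PGOFF` macros follow the diagram above; everything else is an assumption made for illustration: `page_dir` holds ordinary C pointers instead of physical addresses with flags, only one `page_table` exists, and `translate` is a hypothetical helper, not a JOS function.

```c
#include <stdint.h>

#define PDX(la)   ((((uint32_t)(la)) >> 22) & 0x3FF)  /* top 10 bits */
#define PTX(la)   ((((uint32_t)(la)) >> 12) & 0x3FF)  /* middle 10 bits */
#define PGOFF(la) (((uint32_t)(la)) & 0xFFF)          /* low 12 bits */

#define PTE_P 0x001  /* present bit */

/* Simulated tables. A real page directory entry holds a physical
 * address plus flags, not a C pointer. */
static uint32_t page_table[1024];
static uint32_t *page_dir[1024];

/* Walk the two levels for `va`. On success store the physical
 * address in *pa and return 0; return -1 if `va` is not mapped. */
int translate(uint32_t va, uint32_t *pa)
{
	uint32_t *pt = page_dir[PDX(va)];
	uint32_t pte;

	if (!pt)
		return -1;
	pte = pt[PTX(va)];
	if (!(pte & PTE_P))
		return -1;
	*pa = (pte & ~0xFFFu) | PGOFF(va);  /* page base + offset */
	return 0;
}
```

For the address 0xF0100B8B, for instance, the directory index is 0x3C0, the table index is 0x100 and the offset is 0xB8B.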

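What we need to do now is build such page tables and a page directory, and implement the translation from virtual to physical addresses. For details, see the code on GitHub: kern/pmap.c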

###Part 3

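Initializing the Kernel Address Space

We are almost done. Next we need to initialize the memory space. The thing to watch here is that after values have been stored into pgdir entries and PTEs, later reads must mask off the low flag bits; otherwise you read a wrong address (the original address | flags).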
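
A small sketch of that masking, assuming the usual x86 layout in which the low 12 bits of an entry hold flags and the upper 20 bits hold the page frame address. `make_pte` is a hypothetical helper used only for this illustration.

```c
#include <stdint.h>

#define PTE_P 0x001  /* present */
#define PTE_W 0x002  /* writable */
#define PTE_U 0x004  /* user-accessible */

/* Upper 20 bits: page frame address. Lower 12 bits: flags. */
#define PTE_ADDR(pte)  ((uint32_t)(pte) & ~0xFFFu)
#define PTE_FLAGS(pte) ((uint32_t)(pte) & 0xFFFu)

/* Build an entry from a page-aligned physical address and permissions. */
uint32_t make_pte(uint32_t pa, uint32_t perm)
{
	return PTE_ADDR(pa) | perm | PTE_P;
}
```

Storing 0x00345000 with `PTE_W | PTE_U` gives the entry 0x00345007; using that raw value as an address would be off by 7, while `PTE_ADDR` recovers 0x00345000.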

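###Done

That is the end of Lab2. Although this write-up is fairly short, the amount of code is actually quite large. Be especially careful when converting between pointer types and uint32_t.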

##Exercise #12##

This is the last exercise of Lab1, and it touches on quite a lot, so it gets a post of its own.

Exercise 12. Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.

Add a backtrace command to the kernel monitor, and extend your implementation of mon_backtrace to call debuginfo_eip and print a line for each stack frame of the form:

K> backtrace
Stack backtrace:
ebp f010ff78  eip f01008ae  args 00000001 f010ff8c 00000000 f0110580 00000000
     kern/monitor.c:143: monitor+106
ebp f010ffd8  eip f0100193  args 00000000 00001aac 00000660 00000000 00000000
     kern/init.c:49: i386_init+59
ebp f010fff8  eip f010003d  args 00000000 00000000 0000ffff 10cf9a00 0000ffff
     kern/entry.S:70: <unknown>+0
K>

Each line gives the file name and line within that file of the stack frame’s eip, followed by the name of the function and the offset of the eip from the first instruction of the function (e.g., monitor+106 means the return eip is 106 bytes past the beginning of monitor).

Be sure to print the file and function names on a separate line, to avoid confusing the grading script.

You may find that some functions are missing from the backtrace. For example, you will probably see a call to monitor() but not to runcmd(). This is because the compiler in-lines some function calls. Other optimizations may cause you to see unexpected line numbers. If you get rid of the -O2 from GNUMakefile, the backtraces may make more sense (but your kernel will run more slowly).

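That is the description of the exercise. What it asks for is essentially one thing: write a backtrace function (command) so that running backtrace in the kernel monitor shows the current stack frames. This is very helpful for debugging. The backtrace prints the file, line number, function name and so on for each call, much like the where command in gdb.

First, it is best to set the optimization option in GNUMakefile to -O0 to disable optimization, so that some of our functions, mostly the __inline__ ones, are not optimized away.

Second, read kern/kdebug.c. It mainly contains these two functions: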

    
static void stab_binsearch(const struct Stab *stabs, int *region_left, int *region_right,
        int type, uintptr_t addr);

// debuginfo_eip(addr, info)
//
//	Fill in the 'info' structure with information about the specified
//	instruction address, 'addr'.  Returns 0 if information was found, and
//	negative if not.  But even if it returns negative it has stored some
//	information into '*info'.
//
int debuginfo_eip(uintptr_t addr, struct Eipdebuginfo *info);

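debuginfo_eip looks simple and is fairly easy to understand. The argument addr is the address of the instruction to look up (the eip), and the result is stored in info.

But how is that information obtained? It has to be looked up in the symbol table (stabs). That is what stab_binsearch does: it searches the symbol table and returns the result. It is not as simple as it looks, though.

First, the symbol table. In the C program's memory it appears as an array of structs, and each entry has this layout: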

// Entries in the STABS table are formatted as follows.
struct Stab {
	uint32_t n_strx;	// index into string table of name
	uint8_t n_type;         // type of symbol
	uint8_t n_other;        // misc info (usually empty)
	uint16_t n_desc;        // description field
	uintptr_t n_value;	// value of symbol
};

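Depending on its type n_type, an entry's n_desc and n_value fields mean different things. There is documentation describing each type and the meaning of each field; the ones to look at are N_SO, N_FUN and N_SLINE. The entries are also laid out in a regular order. Assuming we only consider the types N_SO, N_FUN and N_SLINE, it looks roughly like this: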

N_SO  file_1-start ...
...
N_FUN function_1-in-file-1-start .....
....
N_SLINE line_a-in-function_1
...
N_SLINE line_b-in-function_1
...
N_FUN function_2-in-file-1-start .....
...
N_SLINE line_c-in-function_2
...
N_SO file_2-start ......
....

N_FUN function_3-in-file-2-start .....
....
N_SO file_3-start ...

In other words, the entries nest inside one another, a bit like the HTML DOM. By giving stab_binsearch a type, we can narrow the range step by step until we find the entry we need.

Next, debuginfo_eip:

    int
    debuginfo_eip(uintptr_t addr, struct Eipdebuginfo *info)
    {
    	const struct Stab *stabs, *stab_end;
    	const char *stabstr, *stabstr_end;
    	int lfile, rfile, lfun, rfun, lline, rline;

    	// Initialize *info
    	info->eip_file = "<unknown>";
    	info->eip_line = 0;
    	info->eip_fn_name = "<unknown>";
    	info->eip_fn_namelen = 9;
    	info->eip_fn_addr = addr;
    	info->eip_fn_narg = 0;
    
        stabs_fix();

    	// Find the relevant set of stabs
    	if (addr >= ULIM) {
    		stabs = __STAB_BEGIN__;
    		stab_end = __STAB_END__;
    		stabstr = __STABSTR_BEGIN__;
    		stabstr_end = __STABSTR_END__;
    	} else {
                    // Can't search for user-level addresses yet!
      	        panic("User address");
    	}

    	// String table validity checks
    	if (stabstr_end <= stabstr || stabstr_end[-1] != 0)
    		return -1;

    	// Now we find the right stabs that define the function containing
    	// 'eip'.  First, we find the basic source file containing 'eip'.
    	// Then, we look in that source file for the function.  Then we look
    	// for the line number.

    	// Search the entire set of stabs for the source file (type N_SO).
    	lfile = 0;
    	rfile = (stab_end - stabs) - 1;
    	stab_binsearch(stabs, &lfile, &rfile, N_SO, addr);
    	if (lfile == 0)
    		return -1;

    	// Search within that file's stabs for the function definition
    	// (N_FUN).
    	lfun = lfile;
    	rfun = rfile;
    	stab_binsearch(stabs, &lfun, &rfun, N_FUN, addr);

    	if (lfun <= rfun) {
    		// stabs[lfun] points to the function name
    		// in the string table, but check bounds just in case.
    		if (stabs[lfun].n_strx < stabstr_end - stabstr)
    			info->eip_fn_name = stabstr + stabs[lfun].n_strx;
    		info->eip_fn_addr = stabs[lfun].n_value;
    		addr -= info->eip_fn_addr;
    		// Search within the function definition for the line number.
    		lline = lfun;
    		rline = rfun;
    	} else {
    		// Couldn't find function stab!  Maybe we're in an assembly
    		// file.  Search the whole file for the line number.
    		info->eip_fn_addr = addr;
    		lline = lfile;
    		rline = rfile;
    	}
    	// Ignore stuff after the colon.
    	info->eip_fn_namelen = strfind(info->eip_fn_name, ':') - info->eip_fn_name;
    	// Search within [lline, rline] for the line number stab.
    	// If found, set info->eip_line to the right line number.
    	// If not found, return -1.
    	//
    	// Hint:
    	//	There's a particular stabs type used for line numbers.
    	//	Look at the STABS documentation and <inc/stab.h> to find
    	//	which one.
    	// Your code here.
    	// SUNUS, 2013-10-09
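    	stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
    	// No line stab covers this address: report failure as the
    	// comment above requires.
    	if (lline > rline)
    		return -1;
    	info->eip_line = stabs[lline].n_desc;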
    	// Search backwards from the line number for the relevant filename
    	// stab.
    	// We can't just use the "lfile" stab because inlined functions
    	// can interpolate code from a different file!
    	// Such included source files use the N_SOL stab type.
    	while (lline >= lfile
    	       && stabs[lline].n_type != N_SOL
    	       && (stabs[lline].n_type != N_SO || !stabs[lline].n_value))
    		lline--;
    	if (lline >= lfile && stabs[lline].n_strx < stabstr_end - stabstr)
    		info->eip_file = stabstr + stabs[lline].n_strx;


    	// Set eip_fn_narg to the number of arguments taken by the function,
    	// or 0 if there was no containing function.
    	if (lfun < rfun)
    		for (lline = lfun + 1;
    		     lline < rfun && stabs[lline].n_type == N_PSYM;
    		     lline++)
    			info->eip_fn_narg++;

    	return 0;
    }
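
One detail of the listing above matters when printing a frame: a function's stab name looks like test_backtrace:F(0,18), and `eip_fn_namelen` records how many characters come before the colon. The sketch below shows one way to print only that prefix with the `%.*s` conversion of standard C `printf`. `fn_namelen` and `format_frame` are hypothetical helpers written for this illustration, and whether the kernel's own `cprintf` accepts the same conversion is an assumption to check.

```c
#include <stdio.h>
#include <string.h>

/* Number of characters before the first ':' (whole string if none). */
int fn_namelen(const char *stab_name)
{
	const char *colon = strchr(stab_name, ':');

	return colon ? (int)(colon - stab_name) : (int)strlen(stab_name);
}

/* Write "file:line: fn+offset" into buf; returns snprintf's result. */
int format_frame(char *buf, size_t len, const char *file, int line,
                 const char *stab_name, unsigned offset)
{
	/* %.*s prints at most fn_namelen() characters of stab_name. */
	return snprintf(buf, len, "%s:%d: %.*s+%u", file, line,
	                fn_namelen(stab_name), stab_name, offset);
}
```

The output has the same shape as the sample lines in the exercise text, such as kern/monitor.c:143: monitor+106.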

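The main steps of the search are:

  1. Find the range of the file containing eip by searching the whole stabs table for type N_SO. This step yields the file name for the instruction.
  2. Step 1 gives the subset of entries that belong to that file. Searching that range for type N_FUN finds the function containing the instruction, and yields the function name, function address and so on.
  3. Step 2 gives the subset of entries that belong to that function. Searching it for type N_SLINE finds the source line number of the instruction.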
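
The three narrowing steps can be tried out on a tiny table. This is a simplified sketch, not the JOS implementation: it uses a linear scan instead of `stab_binsearch`, the type constants `T_SO`, `T_FUN` and `T_SLINE` are made-up values rather than those in inc/stab.h, and `struct Ent`, `find_last`, `region_end` and `lookup` are hypothetical names. The sample rows are a hand-picked subset of the kern/init.c entries in the stabs dump shown later in this post.

```c
#include <stdint.h>

enum { T_SO = 1, T_FUN, T_SLINE };  /* made-up type codes */

struct Ent { int type; unsigned desc; uint32_t value; const char *name; };
struct Where { const char *file; const char *fn; unsigned line; };

static const struct Ent sample[] = {
	{ T_SO,    0,  0xf0100040, "kern/init.c" },
	{ T_FUN,   12, 0xf0100040, "test_backtrace:F(0,18)" },
	{ T_SLINE, 13, 0x00, "" }, { T_SLINE, 14, 0x06, "" },
	{ T_SLINE, 15, 0x19, "" }, { T_SLINE, 16, 0x1f, "" },
	{ T_SLINE, 18, 0x2f, "" }, { T_SLINE, 19, 0x4b, "" },
	{ T_SLINE, 20, 0x5e, "" },
	{ T_FUN,   23, 0xf01000a0, "i386_init:F(0,18)" },
	{ T_SLINE, 24, 0x00, "" }, { T_SLINE, 30, 0x06, "" },
	{ T_SO,    0,  0xf01001ac, "kern/console.c" },
};

/* Index of the last `type` entry in [lo, hi] with value <= addr, or -1. */
static int find_last(const struct Ent *t, int lo, int hi, int type, uint32_t addr)
{
	int found = -1;

	for (int i = lo; i <= hi; i++)
		if (t[i].type == type && t[i].value <= addr)
			found = i;
	return found;
}

/* Last index before the next `type` entry after `from`, capped at hi. */
static int region_end(const struct Ent *t, int from, int hi, int type)
{
	for (int i = from + 1; i <= hi; i++)
		if (t[i].type == type)
			return i - 1;
	return hi;
}

/* File, then function inside the file, then line inside the function. */
int lookup(const struct Ent *t, int n, uint32_t addr, struct Where *out)
{
	int lfile, rfile, lfun, rfun, lline;

	if ((lfile = find_last(t, 0, n - 1, T_SO, addr)) < 0)
		return -1;
	rfile = region_end(t, lfile, n - 1, T_SO);
	if ((lfun = find_last(t, lfile, rfile, T_FUN, addr)) < 0)
		return -1;
	rfun = region_end(t, lfun, rfile, T_FUN);
	/* Line entries store offsets from the start of the function. */
	lline = find_last(t, lfun, rfun, T_SLINE, addr - t[lfun].value);
	if (lline < 0)
		return -1;
	out->file = t[lfile].name;
	out->fn = t[lfun].name;
	out->line = t[lline].desc;
	return 0;
}
```

Looking up 0xf010006d in this table lands in kern/init.c, in test_backtrace, at line 16, which agrees with the gdb backtrace shown later. The scan assumes that values of the same type increase through the table, which holds for these rows.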

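It still looks simple, but while implementing it I was thrown off by what I believe is an external factor. Here is what happened. In my environment, stab_binsearch finds the correct file name without any problem. The search for the function's entry, however, goes wrong, and so everything after it is wrong as well.

The debugging session went as follows: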

make qemu-nox-gdb

#Open another terminal to get the stabs directly from the kernel for later examination.

objdump -G obj/kern/kernel > stabs

#Open another terminal to run gdb to debug.

gdb
b debuginfo_eip
c
Breakpoint 1, debuginfo_eip (addr=4027580555, info=0xf0110edc) at kern/kdebug.c:137
137     info->eip_file = "<unknown>";
(gdb) where
#0  debuginfo_eip (addr=4027580555, info=0xf0110edc) at kern/kdebug.c:137
#1  0xf0100b8b in mon_backtrace (argc=0, argv=0x0, tf=0x0) at kern/monitor.c:70
#2  0xf010008b in test_backtrace (x=0) at kern/init.c:18
#3  0xf010006d in test_backtrace (x=1) at kern/init.c:16
#4  0xf010006d in test_backtrace (x=2) at kern/init.c:16
#5  0xf010006d in test_backtrace (x=3) at kern/init.c:16
#6  0xf010006d in test_backtrace (x=4) at kern/init.c:16
#7  0xf010006d in test_backtrace (x=5) at kern/init.c:16
#8  0xf01000f1 in i386_init () at kern/init.c:39
#9  0xf010003e in relocated () at kern/entry.S:80

The backtrace shown by gdb is correct, and the exact chain of function calls is visible.

Inside debuginfo_eip, the first search that runs (the search for the source file) is:

stab_binsearch(stabs, &lfile, &rfile, N_SO, addr);
# lfile == 69 , rfile == 155

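Then check this against the stabs table obtained earlier (abridged). Because the dump numbers its entries starting from -1, the range actually returned corresponds to Symnum [68, 154].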

obj/kern/kernel:     file format elf32-i386

Contents of .stab section:

Symnum n_type n_othr n_desc n_value  n_strx String

-1     HdrSym 0      1519   00001d04 1     
0      SO     0      0      f0100000 1      {standard input}
1      SOL    0      0      f010000c 18     kern/entry.S
..
62     LSYM   0      145    00000000 2622   pde_t:t(3,2)=(2,9)
63     EINCL  0      0      00000000 0      
64     GSYM   0      33     00000000 2641   entry_pgtable:G(0,19)=ar(0,20)=r(0,20);0;037777777777;;0;1023;(3,1)
65     GSYM   0      21     00000000 2709   entry_pgdir:G(0,21)=ar(0,20);0;1023;(3,2)
66     SO     0      0      f0100040 0      
67     SO     0      2      f0100040 31     /home/sunus/myProjects/6828/jos/
68     SO     0      2      f0100040 2751   kern/init.c
...
88     BINCL  0      0      00000000 2763   ./inc/stdio.h
89     BINCL  0      0      00000650 2777   ./inc/stdarg.h
90     LSYM   0      6      00000000 2792   va_list:t(2,1)=(2,2)=*(0,2)
91     EINCL  0      0      00000000 0      
92     EINCL  0      0      00000000 0      
93     BINCL  0      0      00000000 2820   ./inc/string.h
94     EXCL   0      0      00005d17 839    ./inc/types.h
95     EINCL  0      0      00000000 0      
96     FUN    0      12     f0100040 2835   test_backtrace:F(0,18)
97     PSYM   0      12     00000008 2858   x:p(0,1)
98     BNSYM  0      0      f0100040 0      
99     SLINE  0      13     00000000 0      
100    SLINE  0      14     00000006 0      
101    SLINE  0      15     00000019 0      
102    SLINE  0      16     0000001f 0      
103    SLINE  0      18     0000002f 0      
104    SLINE  0      19     0000004b 0      
105    SLINE  0      20     0000005e 0      
106    FUN    0      0      00000060 0      
107    ENSYM  0      0      f01000a0 0      
108    FUN    0      23     f01000a0 2867   i386_init:F(0,18)
109    BNSYM  0      0      f01000a0 0      
110    SLINE  0      24     00000000 0      
111    SLINE  0      30     00000006 0      
....
154    GSYM   0      51     00000000 2988   panicstr:G(0,19)
155    SO     0      0      f01001ac 0      
156    SO     0      2      f01001ac 31     /home/sunus/myProjects/6828/jos/
157    SO     0      2      f01001ac 3005   kern/console.c

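As you can see, entry 68 is the file containing the current eip, and 155 marks where that file ends. So the file-name search is correct. Next comes the search for the function: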

// Search within that file's stabs for the function definition
// (N_FUN).
lfun = lfile;
rfun = rfile;
stab_binsearch(stabs, &lfun, &rfun, N_FUN, addr);
#lfun == 107, rfun == 108

So what is entry 107 of the array (Symnum 106 in the dump)?

106    FUN    0      0      00000060 0      

Hm? What is this? I do not know. But looking through the stabs dump, there is this entry:

96     FUN    0      12     f0100040 2835   test_backtrace:F(0,18)

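Clearly this is the entry we need. My guess is that the compiler or some compile option is responsible, but my searching and investigation turned up nothing. Entries like the one at 106 also show up in a very regular pattern: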

grep -rn 'FUN' stabs

stabs:105:96     FUN    0      12     f0100040 2835   test_backtrace:F(0,18)
stabs:115:106    FUN    0      0      00000060 0
stabs:117:108    FUN    0      23     f01000a0 2867   i386_init:F(0,18)
stabs:125:116    FUN    0      0      0000005f 0
stabs:127:118    FUN    0      58     f01000ff 2885   _panic:F(0,18)
stabs:145:136    FUN    0      0      00000067 0
stabs:147:138    FUN    0      83     f0100166 2961   _warn:F(0,18)
stabs:161:152    FUN    0      0      00000046 0
stabs:195:186    FUN    0      16     f01001ac 3047   delay:f(0,18)
stabs:202:193    FUN    0      0      00000048 0
stabs:204:195    FUN    0      51     f01001f4 3061   serial_proc_data:f(0,1)
stabs:219:210    FUN    0      0      00000044 0
stabs:221:212    FUN    0      59     f0100238 3085   serial_intr:F(0,18)
stabs:227:218    FUN    0      0      0000001f 0
stabs:229:220    FUN    0      66     f0100257 3105   serial_putc:f(0,18)
stabs:251:242    FUN    0      0      00000059 0
stabs:253:244    FUN    0      79     f01002b0 3142   serial_init:f(0,18)
stabs:266:257    FUN    0      0      000000cb 0
stabs:268:259    FUN    0      112    f010037b 3162   lpt_putc:f(0,18)
stabs:289:280    FUN    0      0      00000079 0
stabs:291:282    FUN    0      133    f01003f4 3179   cga_init:f(0,18)
stabs:330:321    FUN    0      0      000000cc 0
stabs:332:323    FUN    0      163    f01004c0 3241   cga_putc:f(0,18)
stabs:381:372    FUN    0      0      00000212 0
stabs:383:374    FUN    0      316    f01006d2 3258   kbd_proc_data:f(0,1)
stabs:433:424    FUN    0      0      00000189 0
stabs:435:426    FUN    0      364    f010085b 3311   kbd_intr:F(0,18)
stabs:440:431    FUN    0      0      00000014 0
stabs:442:433    FUN    0      370    f010086f 3328   kbd_init:f(0,18)
stabs:446:437    FUN    0      0      00000005 0
stabs:448:439    FUN    0      392    f0100874 3345   cons_intr:f(0,18)
stabs:463:454    FUN    0      0      0000004d 0
stabs:465:456    FUN    0      407    f01008c1 3391   cons_getc:F(0,1)
stabs:480:471    FUN    0      0      0000005c 0
stabs:482:473    FUN    0      429    f010091d 3408   cons_putc:f(0,18)
stabs:490:481    FUN    0      0      00000029 0
stabs:492:483    FUN    0      438    f0100946 3426   cons_init:F(0,18)

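As for why entries like 106 appear, I still have no answer; if you know, please leave a comment. In the meantime, I wrote a simple stabs_fix that removes the entries I currently consider invalid: at the start of debuginfo_eip it marks those entries by hand, so stab_binsearch no longer picks them.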

// A quick & dirty fix: skip N_FUN stab entries whose n_value is below KERNBASE

#define N_INVAILD_FUN 0x73
static void stabs_fix()
{
    const struct Stab *stabs, *stab_end;
    stabs = __STAB_BEGIN__;
    stab_end = __STAB_END__;
    static int is_fixed = 0;
    int i = 0;
    uint8_t *p_fix;
    if(is_fixed)
        return ;
    for(; i < stab_end - stabs; i++) {
        if ((stabs[i].n_type == N_FUN) && (stabs[i].n_value < KERNBASE)) {
            p_fix = (uint8_t *)&stabs[i].n_type;
            *p_fix = N_INVAILD_FUN;
        }
    }
    cprintf("stabs fixed!\n");
    is_fixed = 1;
}

With that, the problem is solved. The complete code is available on GitHub.

##UPDATE##

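  • The problem is solved: it is caused by using the gcc flags `-g` and `-gstabs` together.
  • The extra debug information added by `-g` is exactly those seemingly invalid stabs entries.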

###Exercise #8

We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form “%o”. Find and fill in this code fragment.

The code is simple. In `vprintfmt` (lib/printfmt.c: `void vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap)`), add the following:

case 'o':
    // unsigned octal: fetch the argument and print it in base 8
    num = getuint(&ap, lflag);
    base = 8;
    goto number;
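
To make the effect of `base = 8` concrete, here is a minimal standalone sketch of the digit loop that a base-8 conversion boils down to. The helper name `format_octal` is hypothetical and not part of JOS; the real kernel prints digits through its own number-printing path after the `number` label.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of JOS): write the octal digits of num
 * into buf, NUL-terminate it, and return the number of digits.
 * buf must have room for at least 23 bytes (22 digits for 64 bits). */
static size_t format_octal(unsigned long long num, char *buf)
{
    char tmp[24];
    size_t n = 0;

    assert(buf != NULL);

    /* Peel off base-8 digits, least significant first. */
    do {
        tmp[n++] = (char)('0' + (num % 8));
        num /= 8;
    } while (num != 0);

    /* Reverse them into the output buffer. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

Calling `format_octal(8, buf)` fills `buf` with `10`, which is what `cprintf("%o", 8)` is expected to print once the `case 'o'` branch is in place.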

Be able to answer the following questions:

Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?

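Printing a character to the console goes through the following steps:

  1. The user calls `cprintf`, declared in kern/printf.c as `int cprintf(const char *fmt, ...)`.
  2. `cprintf` extracts the variable argument list and passes it to `vcprintf`, declared in kern/printf.c as `int vcprintf(const char *fmt, va_list ap)`.
  3. `vcprintf` then calls `vprintfmt`, declared in lib/printfmt.c as `void vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap)`.
  4. `vprintfmt` prints characters through the `putch` function that `vcprintf` passes in. The main logic (scanning the format string) also lives in `vprintfmt`.
  5. The number of characters printed travels back to `vcprintf` through the `putdat` argument, and `vcprintf` in turn returns that count to the caller of `cprintf`.
  6. `putch` is the function that prints a single character; the version in kern/printf.c does so by calling `cputchar`, which is the function console.c exports.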
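
The chain described above can be sketched outside the kernel. This is a simplified model under stated assumptions, not the JOS source: all `*_demo` names and the `sink` buffer are made up for illustration, and only `%d` and `%%` are handled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for the console: output is collected here. */
static char sink[128];
static int sink_len;

/* Plays the role of putch: emit one character and bump the counter
 * that arrives through the opaque putdat pointer. */
static void putch_demo(int ch, void *putdat)
{
    int *cnt = putdat;

    assert(cnt != NULL);
    if (sink_len < (int)sizeof(sink) - 1)
        sink[sink_len++] = (char)ch;
    sink[sink_len] = '\0';
    (*cnt)++;
}

/* Plays the role of vprintfmt: scan fmt and push every output
 * character through the callback. */
static void vprintfmt_demo(void (*putch)(int, void *), void *putdat,
                           const char *fmt, va_list ap)
{
    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            putch(*fmt, putdat);
            continue;
        }
        fmt++;
        if (*fmt == '\0')
            break;                      /* lone trailing '%' */
        if (*fmt == 'd') {
            int v = va_arg(ap, int);
            unsigned int u = v < 0 ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[12];
            int n = 0;

            if (v < 0)
                putch('-', putdat);
            do {
                digits[n++] = (char)('0' + u % 10);
                u /= 10;
            } while (u != 0);
            while (n > 0)
                putch(digits[--n], putdat);
        } else if (*fmt == '%') {
            putch('%', putdat);
        }
        /* Any other conversion is silently skipped in this sketch. */
    }
}

/* Plays the role of vcprintf: owns the counter, picks the callback. */
static int vcprintf_demo(const char *fmt, va_list ap)
{
    int cnt = 0;

    vprintfmt_demo(putch_demo, &cnt, fmt, ap);
    return cnt;
}

/* Plays the role of cprintf: turns "..." into a va_list. */
static int cprintf_demo(const char *fmt, ...)
{
    va_list ap;
    int cnt;

    va_start(ap, fmt);
    cnt = vcprintf_demo(fmt, ap);
    va_end(ap);
    return cnt;
}
```

The point of the design is that the formatting code never knows where characters go: swapping `putch_demo` for a function that writes into a string buffer would give an `snprintf`-style routine with the same scanning logic.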

Explain the following from console.c:

if (crt_pos >= CRT_SIZE) {
        int i;
        memcpy(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
        for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
                crt_buf[i] = 0x0700 | ' ';
        crt_pos -= CRT_COLS;
}

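Given the macro `#define CRT_SIZE (CRT_ROWS * CRT_COLS)`, `CRT_SIZE` is the maximum number of characters one screen can display. So this code handles a full screen: it scrolls everything up by one row (dropping the top row), fills the last row with blanks, and moves `crt_pos` back by one row so that output continues on the new bottom line.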
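
The scrolling behaviour is easier to see on a tiny screen. The sketch below models it with a 3x4 buffer; the `DEMO_*` and `demo_*` names are hypothetical, and it uses `memmove` instead of the quoted `memcpy` because the source and destination ranges overlap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_ROWS = 3, DEMO_COLS = 4, DEMO_SIZE = DEMO_ROWS * DEMO_COLS };

static uint16_t demo_buf[DEMO_SIZE];   /* one cell = attribute | character */
static int demo_pos;                   /* next cell to write */

/* Store one character cell, then scroll if the buffer became full. */
static void demo_putc(char c)
{
    assert(demo_pos >= 0 && demo_pos < DEMO_SIZE);
    demo_buf[demo_pos++] = (uint16_t)(0x0700 | (unsigned char)c);

    if (demo_pos >= DEMO_SIZE) {
        /* Move rows 1..N-1 up to rows 0..N-2. */
        memmove(demo_buf, demo_buf + DEMO_COLS,
                (DEMO_SIZE - DEMO_COLS) * sizeof(uint16_t));
        /* Blank the last row. */
        for (int i = DEMO_SIZE - DEMO_COLS; i < DEMO_SIZE; i++)
            demo_buf[i] = 0x0700 | ' ';
        /* Continue writing at the start of the last row. */
        demo_pos -= DEMO_COLS;
    }
}
```

After writing the twelve characters `a` through `l`, the first row `abcd` is gone, the buffer starts with `efgh`, and the cursor sits at the start of a blank third row.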

  • For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC’s calling convention on the x86.

Trace the execution of the following code step-by-step:

int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);
  • In the call to cprintf(), to what does fmt point? To what does ap point?

    That’s Too Easy.

  • List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

    Already covered above.

Run the following code.

unsigned int i = 0x00646c72;
cprintf("H%x Wo%s", 57616, &i);
  • What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here’s an ASCII table that maps bytes to characters.
  • The output depends on that fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

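    The output is `He110 World`.

    • `e110` is easy: it is the hex representation of 57616. But where does `rld` come from?
    • In little-endian mode on x86, the low-order byte is stored at the lowest address. Suppose the local variable i occupies the stack addresses [addr, addr + 3].
    • Then 0x00 (the most significant byte) is at addr + 3, and 0x72 (the least significant byte) is at addr + 0.
    • So `printf("%s", &i);` prints the bytes 0x72, 0x6c, 0x64, that is `rld`, and stops at the 0x00 byte ('\0').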
  • In the following code, what is going to be printed after ‘y=’? (note: the answer is not a specific value.) Why does this happen?

      cprintf("x=%d y=%d", 3);
    

    Too Easy.

  • Let’s say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

    No idea for now :( Discussion is welcome.
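
Going back to the `He110 World` question, the byte layout can be checked without depending on the host CPU by writing the bytes out explicitly. The helpers `store_le32` and `store_be32` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lay out a 32-bit value the way a little-endian CPU does:
 * least significant byte at the lowest address. */
static void store_le32(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
    out[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Lay out the same value big-endian: most significant byte first. */
static void store_be32(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)((v >> 24) & 0xff);
    out[1] = (unsigned char)((v >> 16) & 0xff);
    out[2] = (unsigned char)((v >> 8) & 0xff);
    out[3] = (unsigned char)(v & 0xff);
}
```

`store_le32(0x00646c72, b)` yields the bytes `'r'`, `'l'`, `'d'`, `'\0'`. On a big-endian machine the same byte sequence comes from `0x726c6400`, as `store_be32` shows, so that is the value `i` would need there. 57616 would not need to change, because `%x` formats the numeric value rather than its bytes in memory.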

##The Stack

In the final exercise of this lab, we will explore in more detail the way the C language uses the stack on the x86, and in the process write a useful new kernel monitor function that prints a backtrace of the stack: a list of the saved Instruction Pointer (IP) values from the nested call instructions that led to the current point of execution.

Exercise 9. Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which “end” of this reserved area is the stack pointer initialized to point to?

Read the last few lines of `kern/entry.S`:

.data
###################################################################
# boot stack
###################################################################
	.p2align	PGSHIFT		# force page alignment
	.globl		bootstack
bootstack:
	.space		KSTKSIZE
	.globl		bootstacktop   
bootstacktop:

Note the three directives `.data`, `.space`, and `.globl`. Here is what they do (`.p2align` is described as well):

  • `.data`
    .data tells as to assemble the following statements onto the end of the data subsection numbered subsection (which is an absolute expression). If subsection is omitted, it defaults to zero.
  • `.space`
    This directive emits size bytes, each of value fill. Both size and fill are absolute expressions. If the comma and fill are omitted, fill is assumed to be zero. This is the same as `.skip`.
  • `.globl`
    .global makes the symbol visible to ld. If you define symbol in your partial program, its value is made available to other partial programs that are linked with it. Otherwise, symbol takes its attributes from a symbol of the same name from another file linked into the same program.
  • .p2align[wl] abs-expr, abs-expr, abs-expr
    • Pad the location counter (in the current subsection) to a particular storage boundary. The first expression (which must be absolute) is the number of low-order zero bits the location counter must have after advancement. For example `.p2align 3’ advances the location counter until it a multiple of 8. If the location counter is already a multiple of 8, no change is needed.

    • The second expression (also absolute) gives the fill value to be stored in the padding bytes. It (and the comma) may be omitted. If it is omitted, the padding bytes are normally zero. However, on some systems, if the section is marked as containing code and the fill value is omitted, the space is filled with no-op instructions.

    • The third expression is also absolute, and is also optional. If it is present, it is the maximum number of bytes that should be skipped by this alignment directive. If doing the alignment would require skipping more bytes than the specified maximum, then the alignment is not done at all. You can omit the fill value (the second argument) entirely by simply using two commas after the required alignment; this can be useful if you want the alignment to be filled with no-op instructions when appropriate.

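  • The complete list of GNU Assembler (GAS) directives can be found under Assembler Directives.
  • With these directives in hand, the assembly code is fairly easy to follow.
    1. In the current .data section, move to a new page boundary (PGSHIFT = 12), i.e. align to 2^12 = 4 KB.
    2. `bootstack` labels the start (lowest address) of the reserved region.
    3. The top of the stack, `bootstacktop`, is `bootstack + KSTKSIZE`; the space in between is reserved with `.space`. Since the x86 stack grows downward, this high end is where the stack pointer starts.
    4. `KSTKSIZE` is `8 * PGSIZE` = 32 KB.
    5. The address of the stack top is `0xf0110000`.
  • To verify this, change `.space KSTKSIZE` to `.space KSTKSIZE, 0x1` so that the reserved bytes are filled with 0x01, then set a breakpoint at line 77 of kern/entry.S. gdb prints: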
(gdb) p/x bootstacktop
$4 = 0x111021
(gdb) x/10x (0x110000-4096*8-3)
    0x107ffd:   0x01000000  0x01010101  0x01010101  0x01010101
    0x10800d:   0x01010101  0x01010101  0x01010101  0x01010101
    0x10801d:   0x01010101  0x01010101
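  • A few bytes do not line up exactly, possibly because the toolchain adds some kind of guard or padding, but the result basically agrees with the conclusion above.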
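
The arithmetic behind these numbers can be written down as a small sketch. The `DEMO_*` macros mirror the values quoted in the text (PGSHIFT = 12, KSTKSIZE = 8 pages); the names themselves and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PGSHIFT   12
#define DEMO_PGSIZE    (1u << DEMO_PGSHIFT)   /* 4096 bytes */
#define DEMO_KSTKSIZE  (8u * DEMO_PGSIZE)     /* 32768 bytes = 32 KB */

/* Given the stack top, return the lowest address of the reserved area. */
static uint32_t stack_bottom(uint32_t stack_top)
{
    assert(stack_top >= DEMO_KSTKSIZE);
    return stack_top - DEMO_KSTKSIZE;
}

/* A page-aligned address has its low PGSHIFT bits clear. */
static int is_page_aligned(uint32_t addr)
{
    return (addr & (DEMO_PGSIZE - 1)) == 0;
}
```

With the stack top at `0xf0110000`, `stack_bottom` gives `0xf0108000`, which is page aligned as `.p2align PGSHIFT` requires. This appears to match the gdb dump above, where the 0x01 fill bytes begin at `0x108000`, the same offset without the `0xf0000000` kernel base.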

###Exercise #11

Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn’t. After you have handed in your Lab 1 code, you are welcome to change the output format of the backtrace function any way you like.

Implement the `mon_backtrace` function in `kern/monitor.c`. The output format looks like this:

Stack backtrace:
ebp f0109e58  eip f0100a62  args 00000001 f0109e80 f0109e98 f0100ed2 00000031
ebp f0109ed8  eip f01000d6  args 00000000 00000000 f0100058 f0109f28 00000061

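The layout of a call stack should be very familiar by now. `inc/x86.h` also provides a very handy helper function, `read_ebp()`.

Once we have the value of the `ebp` register, we can derive the following (treating ebp as an unsigned int):

  1. The return address of the current frame is stored at `ebp+4`; this is the saved `eip`.
  2. Within the current function, the n-th argument is at address `ebp+4(n+1)`.
  3. The value stored at address `ebp` is the caller's ebp, so the saved ebp values chain the frames together.

When writing the code, the thing to watch out for is the old C pointer-arithmetic issue: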

uint32_t ui = 0x00000000;
uint32_t *uip = 0x00000000;
ui += 1;  //ui = 0x00000001
uip += 1; //uip = 0x00000004

With that clear, here is the code in `kern/monitor.c`:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
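	// Walk the chain of saved ebp values, starting from the current frame.
	volatile uint32_t ebp;
	volatile uint32_t *p;
	struct Eipdebuginfo info;
	ebp = read_ebp();
	cprintf("Stack backtrace:\n");
	while (ebp > KSTACKTOP) {
		p = (uint32_t *)ebp;
		// p[0] = caller's ebp, p[1] = return eip, p[2..6] = first five args
		cprintf("ebp %08x  eip %08x  args %08x %08x %08x %08x %08x\n",
                ebp, *(p+1), *(p+2), *(p+3), *(p+4), *(p+5), *(p+6));
		ebp = *p;
	}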
	return 0;
}
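
The frame walk can also be tried outside the kernel on a fake stack. In this sketch, "addresses" are byte offsets into an array, so `mem[ebp / 4]` stands in for `*(uint32_t *)ebp`; `struct demo_frame` and `demo_backtrace` are hypothetical names, and an ebp of 0 is assumed to terminate the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded frame: where the saved-ebp slot lives, the return address
 * one word above it, and the first argument two words above it. */
struct demo_frame {
    uint32_t ebp;
    uint32_t eip;
    uint32_t arg0;
};

/* Follow the saved-ebp chain through mem, starting at byte offset ebp.
 * Stores up to max frames in out and returns how many were found. */
static size_t demo_backtrace(const uint32_t *mem, uint32_t ebp,
                             struct demo_frame *out, size_t max)
{
    size_t n = 0;

    assert(mem != NULL && out != NULL);
    while (ebp != 0 && n < max) {
        const uint32_t *p = mem + ebp / 4;   /* pointer math in words */

        out[n].ebp = ebp;
        out[n].eip = p[1];    /* ebp + 4: return address */
        out[n].arg0 = p[2];   /* ebp + 8: first argument */
        n++;
        ebp = p[0];           /* *ebp: the caller's ebp */
    }
    return n;
}
```

Note how `p[1]` and `p[2]` index by words, while the offsets in the text (`ebp+4`, `ebp+8`) count bytes; that is the same pointer-arithmetic distinction the `ui`/`uip` snippet illustrates.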

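##Basic Information

  • Gender: Male
  • Age: 23
  • Hometown: Guangxi
  • Current residence: Tianjin
  • Preferred work location: No preference

##Contact

  • E-mail: <sunuslee [at] gmail [dot] com>

##Education

  • School: Tianjin Polytechnic University (天津工业大学), Software Engineering, 2009~2014 (including a one-year leave of absence)
  • English: CET-6

##Work Experience

  • 2012.6 ~ 2013.1 (internship)
    • Beijing Cloud Valley (北京云基地), Linux Development Engineer
  • 2013.2 ~ Now
    • Tianjin Polytechnic University, cloud computing lab group, Leader.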

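##Recent Projects

  1. Ovirtnode-Config-Setup
    • URL: https://github.com/sunuslee/ocsetup
    • My first project at Cloud Valley: a Python rewrite of the original Red Hat Ovirtnode-config-setup, a management and configuration program for the oVirt server side, aimed mainly at system administrators. Built mainly with Python and PyGtk.
  2. Find-the-one
  3. Vim-Plugins (Vim plugins written for my own daily needs; released)

    1. ColorSchemePicker
    2. show-git-log
  4. MIT6.828 JOS COURSE
    • Course: http://pdos.csail.mit.edu/6.828/2010
    • Source: https://github.com/sunuslee/sunus_jos
    • From the second half of 2010 through the first half of 2011, I studied MIT 6.828: Operating System Engineering (2010 Fall) together with members of the Chinaunix community and completed up to LAB6 (of 7 labs). I implemented JOS code for booting, memory management, interrupt and exception handling, process scheduling, inter-process communication, the file system, part of the I/O, the network driver (in progress), and some library functions (C plus some assembly). This experience gave me some understanding of the Linux kernel and improved my coding and debugging skills.
  5. Tianjin Polytechnic University cloud computing (OpenStack) interest group
    • URL: http://59.67.107.57/
    • (2013.3~Now) Leader
    • Set up the development and production environments, configured servers, created the team wiki, and wrote wiki articles to help members learn.
    • Have some understanding of each OpenStack component.
    • Recruiting new members.
    • Miscellaneous chores within the group.
  6. There are also some older projects. Most of the code is at https://github.com/sunuslee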

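##Technical Skills

  • Linux

    First used Linux in 2009, and have worked and lived entirely in a Linux environment since 2011. Very familiar with coding on Linux and with the tools needed day to day. Have some knowledge of and experience with the mainstream distributions (RedHat/Debian/Arch). Can write scripts (shell/python) that fit my own needs. Have some experience with server and network configuration, troubleshooting, and environment setup/configuration on Linux. Currently using ArchLinux & OS X.

  • Programming languages

    1. C: my first language, picked up in 2009 and used heavily from 2009 to 2012. Since 2011 I have written C entirely on Linux, and I have some understanding of every aspect of C.

    2. Python: my second language, picked up in 2011 and still used frequently. Familiar with Python's standard library and modules. Have real project experience with the Python web frameworks Django and flask. Can use Python to get my work done and write Pythonic code that follows PEP-8.

    3. Ruby: my third language, picked up in 2012. Can use Ruby to get my work done. Have Ruby on Rails development experience, and am currently building a Web/Mobile App with Ruby on Rails.

    4. Javascript: learned alongside Ruby on Rails. Have development experience with Javascript itself and with some common frameworks (JQuery, JQueryMobile).

    5. Shell: can use shell to get my work done (syncing the blog site, backups, and assorted automation and convenience scripts for Linux).

    6. SQL: basic database design and knowledge of SQL statements.

Other

  1. Strong self-learner with a keen interest in new technology.
  2. Good coding style and habits in every programming language I use.
  3. Can read and write English documentation without difficulty.
  4. Able to find the information and resources I need in technical communities such as http://stackoverflow.com and http://github.com.
  5. Love open source & Linux

SNS

  1. Github, with the vast majority of my code: https://github.com/sunuslee
  2. Tech blog (new): http://sunuslee.github.io
  3. Gitcafe, with a small part of my code: https://gitcafe.com/sunuslee
  4. Zhihu, with some technical answers: http://www.zhihu.com/people/sunus
  5. Blog, with a handful of technical articles: http://sunus.me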