Title

# Happy Monad Fix _輕便_ 寫出 multi-pass compiler

### 2013.10.8
## CindyLinz

---

# 警語

---

+ 接下來將會看到很多程式碼

+ 雖然這次主題跟 coding style 無關，還是非常歡迎批評指教 coding style <br>
  我一直還在摸索 Haskell 程式碼要怎麼排版怎麼寫可以看起來更順眼

---

# 目的語言

---

### GAS (GNU AS)

想說常常都是 C--，這次直接用組合語言，看會不會比較有 fu~ ^^

特別感謝 c9s 的 GAS 介紹文 :D http://c9s.blogspot.tw/2008/06/linux-gnu-as-1.html <br>
我完全靠他教的 :p

```
.section .data
output:
    .ascii "The processor vendor id is 'xxxxxxxxxxxx''\n"
.section .text
.global _start
_start:
    movl $0,%eax
    cpuid
    
    movl $output,%edi
    movl %ebx, 28(%edi)
    movl %edx, 32(%edi)
    movl %ecx, 36(%edi)
    
    movl $4, %eax
    movl $1, %ebx
    movl $output, %ecx
    movl $42, %edx
    int $0x80
    
    movl $1,%eax
    movl $0,%ebx
    int $0x80
```

---

# 測試語言

---

### double.code
```
function double {
    arg + arg
}

output(double(input(0)))
```

---

### fib.code
```
function fib {
    if( arg <= 1 ) {
        1
    } else {
        fib(arg-1) + fib(arg-2)
    }
}

output(fib(input(0)))
```

---

### fib2.code
```
function fiba {
    if( arg <= 1 ) {
        1
    } else {
        fibb(arg-1) + fibb(arg-2)
    }
}

function fibb {
    if( arg <= 1 ) {
        3
    } else {
        fiba(arg-1) + fiba(arg-2)
    }
}

output(fiba(input(0)))
```

---

### balance.code
```
function main {
    function positive {
        function succ {
            arg + 1
        }

if( arg <= 1 ) {
            1000
        } else {
            succ(positive(arg-1))
        }
    }

function negative {
        function succ {
            arg - 1
        }

if( arg <= 1 ) {
            1000
        } else {
            succ(negative(arg-1))
        }
    }

positive(arg) + negative(arg)
}

output(main(input(0)))
```

---

# 「輕便」的意思

---

寫簡易的 compiler 時，我喜歡直接在 syntax rule 右邊的 reduce action 直接寫產生結果的程式，
而不另外建立完整複雜的 AST (abstract syntax tree) 資料結構
```
program : rule1 rule2 { ... gen code ... }

rule1 : XX { ... gen code ... }
      | OO { ... gen code ... }

rule2 : YY { ... gen code ... }
```

---

# 測試環境

我使用的環境:

+ Linux 3.8.30-30-generic #44-Ubuntu SMP
+ ghc 7.6.2
+ happy 1.18.10
+ as 2.23.2
+ ld 2.23.2

因為我的環境是 x86\_64, 所以我生出來的 .s 檔裡面都是用 64bits 用的指令

---

# 檔案說明

---

### test.hs
- 測試用 script
- 產生測試 script 執行檔
  + ghc --make test
- 用法:
  + ./test {h1,h2,h3,h3-1,h3-2,h4,h4-1,h4-2,h5,h5-1} {double,fib,fib2,balance}
- 可以參考這個檔的內容來得到手動編譯的時候要怎麼下指令

---

### \*.code
- 測試語言寫的測試用程式

---

### h1.y
- 簡陋的 LALR parser
- 可以 compile double.code

### h2.y
- 加上一個 LALR parser 常用的 trick 來處理呼叫自己的遞迴函數
- 可以 compile double.code, fib.code

### amb.y
- 用來展示 h2.y 所用的 trick 會降低 parser 的分辨力

---

# MonadFix

---

```haskell
class Monad m => MonadFix m where
  mfix :: (a -> m a) -> m a
```

```haskell
instance MonadFix Maybe where
  mfix f = let a = f (unJust a) in a
    where
      unJust (Just x) = x
      unJust Nothing  = error "mfix Maybe: Nothing"

fix :: (a -> a) -> a
fix f = let x = f x in x

instance MonadFix [] where
  mfix f = case fix (f . head) of
    []    -> []
    (x:_) -> x : mfix (tail . f)

instance (MonadFix m) => MonadFix (StateT s m) where
  mfix f = StateT $ \s -> mfix $ \ ~(a, _) -> runStateT (f a) s
```

---

# 檔案說明

---

### h3.y
- 使用 MonadFix 來處理 mutual execlusive recursive
- 可以 compile double.code, fib.code, fib2.code

### h3-1.y
- 既然使用了 MonadFix, 已不需要 h2.y 所使用的 trick,
  去掉 h2.y 所用的 trick
- 可以 compile double.code, fib.code, fib2.code

### h3-2.y
- 使用 GHC 的 DoRec extension 語法來使用 MonadFix (這才是現在的主流寫法)
- 可以 compile double.code, fib.code, fib2.code

---

### h4.y
- 不使用 Happy 的 Monad 模式，只使用 Happy 的 pure 模式，
  而我們自己的方式來使用 Monad。

這樣可以使 Happy 的 reduce 順序生出來的只是一個個 monad action，
  而 monad 間 binding 的順序由我們自己來控制。

(Monad 模式 Happy 會固定以 post order 走訪的順序，
   走到每一個 node 的 action 時，都是 children node action 全部都已 bind 完之後)
- 可以 compile double.code, fib.code, fib2.code

### h4-1.y
- 直接在自己放的 Monad 裡面一路印出結果,
  但這相當於強迫它在第一個 pass 就要得出結果
- compile 任何檔案都會卡住

---

### h4-2.y
- 使用我們自己的 binding 順序，
  就可以做出一個一般不容易「輕便做出」的功能：
  生出的組語中用到的 label 就是行號。

這功能是用來模擬說直接生出寫死 address offset 的機器指令的感覺，
  因為普通生成組合語言，組譯成 .o 檔以後，還是會再過一層 ld 生成執行檔，
  所以表面上 one-pass，還是隱隱偷用 ld 作它的第二個 pass，
  而直接生出行號 label 的作法，就沒有這個取巧空間了。
- 可以 compile double.code, fib.code, fib2.code

---

### h5.y
- 改變 syntax rule, 準備接受 inner function<br>
  (這邊只為展示 symbol table / scope 的處理，
   所以 inner function 只是名字藏起來的 function，
   沒有 closure 的特性)

- 可以 compile double.code, fib.code, fib2.code
  (balance.code 不會有 syntax error，但是 symbol table 尚未處理，會有 redefined error)

### h5-1.y
- 處理 inner function 的 symbol table / scope
- 可以 compile double.code, fib.code, fib2.code, balance.code

---

# 心得

---

### Happy 是一個普通標準制式的 LALR parser (就像 yacc 一樣)

但當它配上 Haskell 作為後盾，故事才開始不一樣 <br>
.small[(偷偷暫時忽略掉 Happy 的 Attribute grammars & Generalized LR parsing)]

---

### 拋開程式順序=執行順序的包袱

如果心中還隱隱想著先出現的程式碼會先執行，讀 Haskell 的程式偶爾就會感到很意外很困惑

拋開之後，自己寫程式時就可以自由依照自己偏好的方式，例如說程式的相關性來組織程式

---

### 不要問你了解 Monad 是什麼．只要問 Monad 為你解了什麼

曾經我對 Monad do-notation 程式的理解是：循序執行的指令

後來看到 MonadFix，卻是一個活生生的反例... :Q

---

我好希望能為 Monad 找到熟悉具體而又貼切的隱喻

但是

+ 熟悉 具體 → 不貼切
+ 貼切 熟悉 → 不具體
+ 具體 貼切 → 不熟悉

Monad 就是 Monad，不是棋盤，也不是稿紙，更不是綠豆糕.. orz

---

我現在傾向放棄尋找 Monad 的「一般物理意義」

把力氣放在

+ 探索 Monad 可以怎麼用
+ 觀察別人應用 Monad 時
  - 其解法的語法結構
  - Monad 此時扮演的角色 (物理意義)

就像從前第一次看到 while loop 的時候一樣

---

停止焦慮．海闊天空 :D

scm 老師說：

+ 即使不會 Category Theory，Functional Programming 也可以學得很好
+ 即使 Functional Programming 學得很好，可能還是不會 Category Theory

我猜想：

+ 即使說不出 Monad 是什麼，Haskell 還是可以寫得很好
+ 即使 Haskell 寫得很好，可能還是說不出 Monad 是什麼