Syntax

Rules

    rule dummy    // prefix "rule" with global/private to declare a global/private rule
    {
        condition:
            false
    }

    rule TagsExample1 : Foo Bar Baz    // add tags
    {
        ...
    }
    meta:
        my_identifier_1 = "Some string data"
        my_identifier_2 = 24
        my_identifier_3 = true
    // ... and this is single-line comment
Strings

    strings:
        // hexadecimal strings
        $hex_string_01 = { E2 34 ?? C8 A? FB }
        $hex_string_02 = { F4 23 [4-6] 62 B4 }          // 4 to 6 arbitrary bytes in between
        $hex_string_03 = { F4 23 ( 62 B4 | 56 ) 45 }    // matches F4 23 62 B4 45 or F4 23 56 45

        // text strings
        $text_string_01 = "foobar"
        $text_string_02 = "foobar" nocase      // case-insensitive
        $text_string_03 = "foobar" fullword    // matches foobar only when no letter or digit precedes or follows it

        // wide-character strings
        $wide_string = "Borland" wide
        $wide_and_ascii_string = "Borland" wide ascii

        // XOR strings
        $xor_string_01 = "This program cannot" xor
        $xor_string_02 = "This program cannot" xor(0x01-0xff)
        $xor_string_03 = "This program cannot" xor wide ascii

        // regular expressions
        $re1 = /md5: [0-9a-fA-F]{32}/
        $re2 = /state: (on|off)/

        // private strings
        $private_string = "foobar" private

        // anonymous string, for when you cannot be bothered to name it
        $ = "lazycatz"
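The hexadecimal-string syntax above can be approximated with an ordinary byte regex, which helps when reasoning about what a pattern will match. The following is a minimal Python sketch, assuming only the subset shown here (full bytes, `??` and nibble wildcards, `[n-m]` jumps, and alternation). The function name `hex_pattern_to_regex` is hypothetical and is not part of YARA or yarGen; YARA's own matcher is implemented differently.

```python
import re


def hex_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a small subset of YARA hex-string syntax into a bytes regex.

    Hypothetical helper for illustration only; not a YARA API.
    """
    # Tokens: jumps such as [4-6], parentheses, '|', and two-character byte specs.
    tokens = re.findall(r"\[\s*\d+\s*-\s*\d+\s*\]|[()|]|[0-9A-Fa-f?]{2}", pattern)
    parts = []
    for tok in tokens:
        if tok.startswith("["):
            # Jump: between lo and hi arbitrary bytes.
            lo, hi = re.findall(r"\d+", tok)
            parts.append(b".{%d,%d}" % (int(lo), int(hi)))
        elif tok == "(":
            parts.append(b"(?:")
        elif tok in (")", "|"):
            parts.append(tok.encode())
        elif tok == "??":
            # Full wildcard: any single byte.
            parts.append(b".")
        elif "?" in tok:
            # Nibble wildcard: enumerate the 16 bytes that fit, as a character class.
            candidates = [int(tok.replace("?", d), 16) for d in "0123456789abcdef"]
            parts.append(b"[" + b"".join(b"\\x%02x" % c for c in candidates) + b"]")
        else:
            # Literal byte.
            parts.append(b"\\x%02x" % int(tok, 16))
    # DOTALL so that '.' also matches the newline byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)
```

For instance, `hex_pattern_to_regex("{ F4 23 ( 62 B4 | 56 ) 45 }").search(data)` returns a match object when `data` contains either of the two byte sequences listed for `$hex_string_03`.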
Conditions

    condition:
        ($a or $b) and ($c or $d)

        $a at 100 and $b at 200                     // $a is at offset 100 and $b at offset 200 (file offset or virtual address, decimal)
        $a in (0..100) and $b in (100..filesize)    // $a occurs within offsets 0-100; $b occurs between offset 100 and the end of the file
        // @a[i] gives the offset of the i-th occurrence of $a
        // !a[i] gives the length of the i-th occurrence of $a

        filesize > 200KB    // file is larger than 200 KB
        // entrypoint is deprecated; use pe.entry_point from the pe module instead

        // read data at a given offset
        // little-endian by default; the available functions follow the pattern u?int(8|16|32)(be)?
        // MZ signature at offset 0 and ...
        uint16(0) == 0x5A4D and
        // ... PE signature at offset stored in MZ header at 0x3C
        uint32(uint32(0x3C)) == 0x00004550

        // sets of strings
        2 of ($a,$b,$c)      // at least two of these strings occur
        all of them          // every string occurs
        any of them          // at least one string occurs
        all of ($a*)         // every string whose identifier starts with $a occurs
        any of ($a,$b,$c)    // any one of $a, $b, $c occurs
        1 of ($*)            // $* means all strings; 1 of ($*) is equivalent to any of them

        // for expression of string_set : ( boolean_expression )
        for all of them : (
        for all of ($a*) : ( @ > @b )    // every string starting with $a must occur at a higher offset than $b
        // for expression identifier in indexes : ( boolean_expression )
        for all i in (1..3) : ( @a[i] + 10 == @b[i] )    // i runs from 1 to 3
        for all i in (1..

        $a and Rule1    // reference another rule, Rule1
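The `uint16(0) == 0x5A4D` and `uint32(uint32(0x3C)) == 0x00004550` checks above read little-endian integers at fixed offsets. A rough Python equivalent using the standard `struct` module is sketched below. The helper names (`uint16`, `uint32`, `looks_like_pe`) are chosen for illustration only, and returning `None` for an out-of-range read is this sketch's stand-in for YARA's undefined value.

```python
import struct


def uint16(data: bytes, offset: int):
    """Little-endian unsigned 16-bit read; None when the offset is out of range."""
    if offset < 0 or offset + 2 > len(data):
        return None
    return struct.unpack_from("<H", data, offset)[0]


def uint32(data: bytes, offset: int):
    """Little-endian unsigned 32-bit read; None when the offset is out of range."""
    if offset < 0 or offset + 4 > len(data):
        return None
    return struct.unpack_from("<I", data, offset)[0]


def looks_like_pe(data: bytes) -> bool:
    """Mirror of: uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550."""
    # "MZ" signature at offset 0.
    if uint16(data, 0) != 0x5A4D:
        return False
    # Offset of the PE header is stored in the MZ header at 0x3C.
    pe_offset = uint32(data, 0x3C)
    if pe_offset is None:
        return False
    # "PE\0\0" signature at that offset.
    return uint32(data, pe_offset) == 0x00004550
```

Calling `looks_like_pe(open(path, "rb").read())` on a file applies the same two-step test as the condition; the nested call is what makes the second read depend on the value found at `0x3C`.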
Using modules

    import "pe"

    rule Test
    {
        strings:
            $a = "some string"

        condition:
            $a and pe.entry_point == 0x1000
    }
External variables

    condition:
        bool_ext_var or filesize < int_ext_var
        string_ext_var_01 contains "text"
        string_ext_var_02 matches /[a-z]+/    // append i or s after /[a-z]+/ for case-insensitive / single-line matching
        // external variables must be defined on the command line (the -d option of yara)
Including files

    include "./includes/other.yar"
YARA signature extraction

yarGen

Requirements: 4 GB RAM, or 8 GB RAM when analyzing opcodes with --opcodes
Installing dependencies

    sudo pip install pefile scandir lxml naiveBayesClassifier

Downloading and updating the databases

    python yarGen.py --update

Database download URLs
    https://www.bsk-consulting.de/yargen/good-exports-part1.db
    https://www.bsk-consulting.de/yargen/good-exports-part2.db
    https://www.bsk-consulting.de/yargen/good-exports-part3.db
    https://www.bsk-consulting.de/yargen/good-exports-part4.db
    https://www.bsk-consulting.de/yargen/good-exports-part5.db
    https://www.bsk-consulting.de/yargen/good-exports-part6.db
    https://www.bsk-consulting.de/yargen/good-exports-part7.db
    https://www.bsk-consulting.de/yargen/good-exports-part8.db
    https://www.bsk-consulting.de/yargen/good-exports-part9.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part1.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part2.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part3.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part4.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part5.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part6.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part7.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part8.db
    https://www.bsk-consulting.de/yargen/good-imphashes-part9.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part1.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part2.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part3.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part4.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part5.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part6.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part7.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part8.db
    https://www.bsk-consulting.de/yargen/good-opcodes-part9.db
    https://www.bsk-consulting.de/yargen/good-strings-part1.db
    https://www.bsk-consulting.de/yargen/good-strings-part2.db
    https://www.bsk-consulting.de/yargen/good-strings-part3.db
    https://www.bsk-consulting.de/yargen/good-strings-part4.db
    https://www.bsk-consulting.de/yargen/good-strings-part5.db
    https://www.bsk-consulting.de/yargen/good-strings-part6.db
    https://www.bsk-consulting.de/yargen/good-strings-part7.db
    https://www.bsk-consulting.de/yargen/good-strings-part8.db
    https://www.bsk-consulting.de/yargen/good-strings-part9.db
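The list above follows a regular pattern: four database kinds (exports, imphashes, opcodes, strings), each split into parts 1 to 9. A short Python sketch that rebuilds the same 36 URLs is given below, for use in a download script. The names `BASE_URL`, `CATEGORIES` and `database_urls` are illustrative and not part of yarGen, and the sketch makes no claim about whether the files are still hosted at these addresses.

```python
# Base location and naming scheme taken from the URL list in this post.
BASE_URL = "https://www.bsk-consulting.de/yargen"
CATEGORIES = ("exports", "imphashes", "opcodes", "strings")
PARTS = range(1, 10)  # part1 .. part9


def database_urls() -> list:
    """Return the goodware database URLs in the same order as the list above."""
    return [
        f"{BASE_URL}/good-{category}-part{part}.db"
        for category in CATEGORIES
        for part in PARTS
    ]
```

Each returned URL could then be fetched with a tool of your choice; running `python yarGen.py --update` remains the documented way to obtain the databases.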
Generating YARA signatures

    python yarGen.py -m <dir> -o <output file>
references