oracledb_exporter踩坑记录
需求:
希望能通过oracledb_exporter采集到必要信息(表空间信息,交易统计数据等),并发送短信给开发人员。
踩坑总结:
1、值必须为数值类型!使用sum(xxx)等函数时,可能出现NULL的结果,必须使用nvl函数!
nvl(sum(xxxx),0)
2、metricsdesc展示列必须小写!且不允许有下划线!
labels = [“NAME”,"TOTALBYTE","USED_BYTE"]
metricsdesc ={TOTALBYTE="总空间", USED_BYTE="已用空间"}
改为:
labels = [“name”,"totalbyte","usedbyte"]
metricsdesc ={totalbyte="总空间", usedbyte="已用空间"}
踩坑记录:
1.统计交易信息
统计前一天的交易数量,交易总金额等信息
[[metric]]
context = "testtrans"
labels= ["transdate","count","sumamt"] -- 需要关注的列信息
metricsdess={ count="总笔数", sumamt = "总金额" } -- 必须为数值类型
request ="
select to_char(sysdate-1,'yyyyMMdd') as transdate ,
count(1) as count,
sum(t.amt) as sumamt
from t_test_table t where 1=1 and t.transdate = to_char(sysdate-1,'yyyyMMdd')
"
以上为oracledb_exporter 中 default-metrics.toml 追加的自定义查询,重启之后发现,sumamt一直没有数据,一开始测试的时候由于未造数据,所以sum(t.amt) 数据为 NULL ,造数据之后发现sql执行是有数据的,但是Prometheus页面(ip:端口/metrics)上还是 sumamt="" 无数据。
尝试了很多次都是无数据,最后将sumamt 改名为 totamt 后查出数据。
最后排查的结果是由于一开始未造数据时sum(t.amt) 结果为 NULL ,导致prometheus 将该值存为 NaN,所以之后造数据也无法擦除sumamt 的值,float 类型无法写入当前 NaN 的值内。
启动时的错误日志:
"Unable to convert current value to float (metric= sumamt,metricHelp=总金额,value=<>)" source = "main.go:318"
建议修改为:
[[metric]]
context = "testtrans"
labels= ["transdate","count","sumamt"]
metricsdess={ count="总笔数", sumamt = "总金额" }
request ="
select to_char(sysdate-1,'yyyyMMdd') as transdate ,
count(1) as count,
nvl(sum(t.amt),0) as sumamt -- 使用nvl()函数,当值为NULL时赋值为0
from t_test_table t where 1=1 and t.transdate = to_char(sysdate-1,'yyyyMMdd')
"
若值无法擦除,建议改名后重启oracledb_exporter。
2、查询表空间信息
统计表空间信息,表空间初始内存,表空间已用内存,表空间剩余内存,表空间使用率等信息
[[metric]]
context = "testspace"
labels= ["TB_NAME","TB_TOTAL","TB_USED","TB_RATE"]
metricsdess={ TB_TOTAL="默认内存", TB_USED= "已使用空间" ,TB_RATE=”使用率“}
request ="
select a.tablespace_name as TB_NAME,
nvl(round(a.total_size),0) as TB_TOTAL,
nvl(round(a.total_size)-round(b.free_size), 0) as TB_USED,
nvl((round(a.total_size-b.free_size)/total_size*100),2),0) TB_RATE
from (select tablespace_name, sum(bytes)/1024/1024/1024 as total_size from sys.dba_data_files where AUTOEXTENSIBLE='NO' group by tablespace_name ) a,
(select tablespace_name, sum(bytes)/1024/1024/1024 as free_size from sys.dba_free_space group by tablespace_name) b
where a.tablespace_name = b.tablespace_name(+)
"
错误日志:
"Unable to convert current value to float (metric= TB_TOTAL,metricHelp=默认内存,value=<>)" source = "main.go:318"
"Error scraping for testspace _map[TB_TOTAL:默认内存]:No metrics found while parsing" source = "main.do:266"
错误原因:
改为小写并去掉下划线后,该sql可以执行
[[metric]]
context = "testspace"
labels= ["name","max","used","free","rate"]
metricsdess={ max="默认内存", used= "已使用空间" ,free="剩余可用空间",rate=”使用率“}
request ="
select a.tablespace_name as name,
nvl(round(a.total_size),0) as max,
nvl(round(a.total_size)-round(b.free_size), 0) as used,
nvl(round(b.free_size),0) as free,
nvl((round(a.total_size-b.free_size)/total_size*100),2),0) rate
from (select tablespace_name, sum(bytes)/1024/1024/1024 as total_size from sys.dba_data_files where AUTOEXTENSIBLE='NO' group by tablespace_name ) a,
(select tablespace_name, sum(bytes)/1024/1024/1024 as free_size from sys.dba_free_space group by tablespace_name) b
where a.tablespace_name = b.tablespace_name(+)