VOLTHA / VOL-2845

Openolt Adapter crashes after running for an extended period of time

    Details

      Description

      A k8s hardware test that ran over the weekend showed that the openolt adapter crashed repeatedly and was restarted by k8s each time (46 restarts in 3d20h):

      foundry@lab-kube-01:~$ kubectl get pods
      NAME                                                              READY   STATUS    RESTARTS   AGE
      adapter-open-olt-68448fcbdf-ggkmf                                 1/1     Running   46         3d20h
      adapter-open-onu-7945df5fb-djm45                                  1/1     Running   0          3d16h
      cord-kafka-0                                                      1/1     Running   3          4d22h
      cord-kafka-1                                                      1/1     Running   0          4d22h
      cord-kafka-2                                                      1/1     Running   0          4d22h
      cord-kafka-zookeeper-0                                            1/1     Running   0          4d22h
      cord-kafka-zookeeper-1                                            1/1     Running   0          4d22h
      cord-kafka-zookeeper-2                                            1/1     Running   0          4d22h
      etcd-operator-etcd-operator-etcd-backup-operator-88d6bc55cmxlz8   1/1     Running   0          4d22h
      etcd-operator-etcd-operator-etcd-operator-56c55d965f-66zqx        1/1     Running   0          4d22h
      etcd-operator-etcd-operator-etcd-restore-operator-55f6ccbfzdlp6   1/1     Running   0          4d22h
      onos-bf869dcb7-ldw9w                                              1/1     Running   0          3d20h
      voltha-etcd-cluster-6rttskxvj4                                    1/1     Running   0          4d22h
      voltha-etcd-cluster-gp6t7hnq4z                                    1/1     Running   0          4d22h
      voltha-etcd-cluster-vgsbl75q6l                                    1/1     Running   0          4d22h
      voltha-ofagent-64fd9f6446-sdqth                                   1/1     Running   0          3d20h
      voltha-rw-core-56cdd9bcc9-9zs9x                                   1/1     Running   0          3d20h
      

      The cause appears to be a concurrency issue (unsynchronized access to a Go map): the log of the previous container instance (kubectl logs --previous) shows this fatal error:

      fatal error: concurrent map read and map write
      
      goroutine 23399 [running]:
      runtime.throw(0xfaec9f, 0x21)
              /usr/local/go/src/runtime/panic.go:774 +0x72 fp=0xc00077def0 sp=0xc00077dec0 pc=0x4397b2
      runtime.mapaccess1_fast32(0xe15ca0, 0xc0001cb140, 0xc000000006, 0xc0003085e0)
              /usr/local/go/src/runtime/map_fast32.go:21 +0x1a4 fp=0xc00077df18 sp=0xc00077def0 pc=0x41abc4
      github.com/opencord/voltha-openolt-adapter/internal/pkg/core.(*OpenOltStatisticsMgr).PortsStatisticsKpis(0xc0003bf460, 0xc000033860, 0x10)
              /go/src/github.com/opencord/voltha-openolt-adapter/internal/pkg/core/statsmanager.go:460 +0x195 fp=0xc00077df80 sp=0xc00077df18 pc=0xc85d85
      github.com/opencord/voltha-openolt-adapter/internal/pkg/core.(*OpenOltStatisticsMgr).PortStatisticsIndication(0xc0003bf460, 0xc000033860, 0x10)
              /go/src/github.com/opencord/voltha-openolt-adapter/internal/pkg/core/statsmanager.go:389 +0xba fp=0xc00077dfc8 sp=0xc00077df80 pc=0xc85b1a
      runtime.goexit()
              /usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc00077dfd0 sp=0xc00077dfc8 pc=0x466911
      created by github.com/opencord/voltha-openolt-adapter/internal/pkg/core.(*DeviceHandler).handleIndication
              /go/src/github.com/opencord/voltha-openolt-adapter/internal/pkg/core/device_handler.go:521 +0xdb2
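
The trace shows a map read in PortsStatisticsKpis running in a goroutine started by handleIndication, so several such goroutines can touch the same map at once. One common way to remove this class of crash is to guard the map with a sync.RWMutex. The sketch below illustrates that pattern only; it is not the adapter's code and not necessarily the fix that was applied. All names in it (portStats, portStatsCache, add, get, simulateIndications) are hypothetical, and the uint32 key is an assumption based on the mapaccess1_fast32 frame, which indicates a 32-bit key type.

```go
package main

import (
	"fmt"
	"sync"
)

// portStats is a hypothetical stand-in for the per-port counters
// an adapter might keep; the real struct is not shown in this issue.
type portStats struct {
	RxPackets uint64
	TxPackets uint64
}

// portStatsCache wraps the map and the lock together so that no caller
// can reach the map without going through the locked methods.
type portStatsCache struct {
	mu    sync.RWMutex
	ports map[uint32]portStats // key type is an assumption (32-bit interface ID)
}

func newPortStatsCache() *portStatsCache {
	return &portStatsCache{ports: make(map[uint32]portStats)}
}

// add applies counter deltas under the write lock.
func (c *portStatsCache) add(intfID uint32, rx, tx uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	s := c.ports[intfID]
	s.RxPackets += rx
	s.TxPackets += tx
	c.ports[intfID] = s
}

// get returns a copy of the counters under the read lock.
func (c *portStatsCache) get(intfID uint32) (portStats, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	s, ok := c.ports[intfID]
	return s, ok
}

// simulateIndications starts one goroutine per simulated indication,
// loosely mirroring the "created by ... handleIndication" frame in the
// trace. Each goroutine both writes and reads the shared cache.
func simulateIndications(c *portStatsCache, goroutines, perGoroutine int) {
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func(intfID uint32) {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				c.add(intfID, 1, 2)
				c.get(intfID)
			}
		}(uint32(g % 4)) // spread the goroutines over four ports
	}
	wg.Wait()
}

func main() {
	c := newPortStatsCache()
	simulateIndications(c, 16, 1000)
	for id := uint32(0); id < 4; id++ {
		s, _ := c.get(id)
		fmt.Printf("intf %d: rx=%d tx=%d\n", id, s.RxPackets, s.TxPackets)
	}
}
```

With the lock in place, each of the four ports ends at rx=4000 and tx=8000. Running the same workload against a bare map with the Go race detector enabled (go run -race or go test -race) is a quick way to surface this kind of unsynchronized access before it shows up as a crash after days of runtime.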
      

            People

            Assignee:
            ggowdra Girish Gowdra
            Reporter:
            mjeanneret Matt Jeanneret
