Commit f4a66c20 authored by Ashutosh Dixit, committed by Greg Kroah-Hartman

misc: mic: Update MIC host daemon with COSM changes

This patch updates the MIC host daemon to work with corresponding
changes in COSM. Other MIC daemon fixes, cleanups and enhancements
are also rolled into this patch. Changes to MIC sysfs ABI which go
into effect with this patch are also documented.
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
parent d411e793
@@ -41,18 +41,15 @@ Description:
 		When read, this entry provides the current state of an Intel
 		MIC device in the context of the card OS. Possible values that
 		will be read are:
-		"offline" - The MIC device is ready to boot the card OS. On
+		"ready" - The MIC device is ready to boot the card OS. On
 		reading this entry after an OSPM resume, a "boot" has to be
 		written to this entry if the card was previously shutdown
 		during OSPM suspend.
-		"online" - The MIC device has initiated booting a card OS.
+		"booting" - The MIC device has initiated booting a card OS.
+		"online" - The MIC device has completed boot and is online
 		"shutting_down" - The card OS is shutting down.
+		"resetting" - A reset has been initiated for the MIC device
 		"reset_failed" - The MIC device has failed to reset.
-		"suspending" - The MIC device is currently being prepared for
-		suspend. On reading this entry, a "suspend" has to be written
-		to the state sysfs entry to ensure the card is shutdown during
-		OSPM suspend.
-		"suspended" - The MIC device has been suspended.

 		When written, this sysfs entry triggers different state change
 		operations depending upon the current state of the card OS.
@@ -62,8 +59,6 @@ Description:
 		sysfs entries.
 		"reset" - Initiates device reset.
 		"shutdown" - Initiates card OS shutdown.
-		"suspend" - Initiates card OS shutdown and also marks the card
-		as suspended.

 What:		/sys/class/mic/mic(x)/shutdown_status
 Date:		October 2013
@@ -126,7 +121,7 @@ Description:
 		the card. This sysfs entry can be written with the following
 		valid strings:
 		a) linux - Boot a Linux image.
-		b) elf - Boot an elf image for flash updates.
+		b) flash - Boot an image for flash updates.

 What:		/sys/class/mic/mic(x)/log_buf_addr
 Date:		October 2013
@@ -155,3 +150,17 @@ Description:
 		daemon to set the log buffer length address. The correct log
 		buffer length address to be written can be found in the
 		System.map file of the card OS.
+
+What:		/sys/class/mic/mic(x)/heartbeat_enable
+Date:		March 2015
+KernelVersion:	3.20
+Contact:	Ashutosh Dixit <ashutosh.dixit@intel.com>
+Description:
+		The MIC drivers detect and inform user space about card crashes
+		via a heartbeat mechanism (see the description of
+		shutdown_status above). User space can turn off this
+		notification by setting heartbeat_enable to 0 and enable it by
+		setting this entry to 1. If this notification is disabled it is
+		the responsibility of user space to detect card crashes via
+		alternative means such as a network ping. This setting is
+		enabled by default.
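
The sysfs entries documented above can be exercised from a shell. The following is a minimal sketch, not part of the patch: the `MIC_SYSFS` variable and the helper names `mic_state`, `mic_state_is_known` and `mic_set_heartbeat` are hypothetical, and the list of state strings is taken only from the updated description of the `state` entry.

```shell
#!/bin/sh
# Sketch only: MIC_SYSFS and all helper names are hypothetical.
MIC_SYSFS="${MIC_SYSFS:-/sys/class/mic}"

# Print the current state string of a card, e.g. "ready" or "online".
mic_state() {
    cat "$MIC_SYSFS/$1/state"
}

# Succeed only for state strings listed in the updated ABI text.
mic_state_is_known() {
    case "$1" in
        ready|booting|online|shutting_down|resetting|reset_failed) return 0 ;;
        *) return 1 ;;
    esac
}

# Enable (1) or disable (0) the heartbeat-based crash notification.
mic_set_heartbeat() {
    case "$2" in
        0|1) echo "$2" > "$MIC_SYSFS/$1/heartbeat_enable" ;;
        *) echo "heartbeat_enable takes 0 or 1" >&2; return 1 ;;
    esac
}
```

For example, `mic_set_heartbeat mic0 0` would turn the notification off for the first card, after which user space has to detect crashes by other means, as the ABI text notes.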
@@ -28,6 +28,10 @@ The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
 low level communications API across PCIe currently implemented for MIC.
 More details are available at scif_overview.txt.

+The Coprocessor State Management (COSM) driver on the host allows for
+boot, shutdown and reset of Intel MIC devices. It communicates with a COSM
+"client" driver on the MIC cards over SCIF to perform these functions.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
@@ -51,19 +55,20 @@ the fact that the virtio block storage backend can only be on the host.
[ASCII block diagram hunk; column alignment was lost in extraction.
 Boxes present on both sides of the change: Virtio over PCIe IOCTLs,
 MIC DMA Driver, SCIF, SCIF HW Bus, MIC virtual Bus, Intel MIC Card Driver,
 Intel MIC Host Driver, PCIe Bus.
 Boxes added by the change: COSM, COSM Bus.]
......
@@ -119,10 +119,10 @@ stop()
 	# Wait for the cards to go offline
 	for f in $sysfs/*
 	do
-		while [ "`cat $f/state`" != "offline" ]
+		while [ "`cat $f/state`" != "ready" ]
 		do
 			sleep 1
-			echo -e "Waiting for "`basename $f`" to go offline"
+			echo -e "Waiting for "`basename $f`" to become ready"
 		done
 	done
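
The loop in this hunk polls `state` until it reads "ready" and never gives up. As an illustration only, a bounded variant could look like the sketch below. The function name `wait_for_ready`, its arguments and the idea of a timeout are assumptions and do not appear in the patch.

```shell
#!/bin/sh
# Sketch only: wait_for_ready and its timeout argument are hypothetical.
# Usage: wait_for_ready <card sysfs directory> <timeout in seconds>
wait_for_ready() {
    card=$1
    remaining=$2
    while [ "$(cat "$card/state")" != "ready" ]
    do
        # Give up once the budget is spent instead of looping forever.
        if [ "$remaining" -le 0 ]; then
            echo "Timed out waiting for $(basename "$card") to become ready" >&2
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
        echo "Waiting for $(basename "$card") to become ready"
    done
    return 0
}
```

A caller could then decide what to do with a card that stays in, say, "reset_failed", which the unbounded loop cannot report.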
......
This diff is collapsed.
@@ -86,6 +86,7 @@ struct mic_info {
 	int id;
 	char *name;
 	pthread_t config_thread;
+	pthread_t init_thread;
 	pid_t pid;
 	struct mic_console_info mic_console;
 	struct mic_net_info mic_net;
......
@@ -2,5 +2,9 @@
 # Makefile - Intel MIC Linux driver.
 # Copyright(c) 2013, Intel Corporation.
 #
+obj-$(CONFIG_INTEL_MIC_HOST) += host/
+obj-$(CONFIG_INTEL_MIC_CARD) += card/
 obj-y += bus/
 obj-$(CONFIG_SCIF) += scif/
+obj-$(CONFIG_MIC_COSM) += cosm/
+obj-$(CONFIG_MIC_COSM) += cosm_client/
@@ -28,7 +28,6 @@ static ssize_t device_show(struct device *d,

 	return sprintf(buf, "0x%04x\n", dev->id.device);
 }
-
 static DEVICE_ATTR_RO(device);

 static ssize_t vendor_show(struct device *d,
@@ -38,7 +37,6 @@ static ssize_t vendor_show(struct device *d,

 	return sprintf(buf, "0x%04x\n", dev->id.vendor);
 }
-
 static DEVICE_ATTR_RO(vendor);

 static ssize_t modalias_show(struct device *d,
@@ -49,7 +47,6 @@ static ssize_t modalias_show(struct device *d,
 	return sprintf(buf, "scif:d%08Xv%08X\n",
 		       dev->id.device, dev->id.vendor);
 }
-
 static DEVICE_ATTR_RO(modalias);

 static struct attribute *scif_dev_attrs[] = {
@@ -144,7 +141,8 @@ struct scif_hw_dev *
 scif_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
 		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
 		     struct mic_mw *mmio, struct mic_mw *aper, void *dp,
-		     void __iomem *rdp, struct dma_chan **chan, int num_chan)
+		     void __iomem *rdp, struct dma_chan **chan, int num_chan,
+		     bool card_rel_da)
 {
 	int ret;
 	struct scif_hw_dev *sdev;
@@ -171,6 +169,7 @@ scif_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
 	dma_set_mask(&sdev->dev, DMA_BIT_MASK(64));
 	sdev->dma_ch = chan;
 	sdev->num_dma_ch = num_chan;
+	sdev->card_rel_da = card_rel_da;
 	dev_set_name(&sdev->dev, "scif-dev%u", sdev->dnode);
 	/*
 	 * device_register() causes the bus infrastructure to look for a
......
@@ -46,6 +46,8 @@ struct scif_hw_dev_id {
  * @rdp - Remote device page
  * @dma_ch - Array of DMA channels
  * @num_dma_ch - Number of DMA channels available
+ * @card_rel_da - Set to true if DMA addresses programmed in the DMA engine
+ *	are relative to the card point of view
  */
 struct scif_hw_dev {
 	struct scif_hw_ops *hw_ops;
@@ -59,6 +61,7 @@ struct scif_hw_dev {
 	void __iomem *rdp;
 	struct dma_chan **dma_ch;
 	int num_dma_ch;
+	bool card_rel_da;
 };

 /**
@@ -114,7 +117,8 @@ scif_register_device(struct device *pdev, int id,
 		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
 		     struct mic_mw *mmio, struct mic_mw *aper,
 		     void *dp, void __iomem *rdp,
-		     struct dma_chan **chan, int num_chan);
+		     struct dma_chan **chan, int num_chan,
+		     bool card_rel_da);
 void scif_unregister_device(struct scif_hw_dev *sdev);

 static inline struct scif_hw_dev *dev_to_scif(struct device *dev)
......
@@ -75,12 +75,7 @@ struct mic_device_ctrl {
  * struct mic_bootparam: Virtio device independent information in device page
  *
  * @magic: A magic value used by the card to ensure it can see the host
- * @c2h_shutdown_db: Card to Host shutdown doorbell set by host
- * @h2c_shutdown_db: Host to Card shutdown doorbell set by card
  * @h2c_config_db: Host to Card Virtio config doorbell set by card
- * @shutdown_status: Card shutdown status set by card
- * @shutdown_card: Set to 1 by the host when a card shutdown is initiated
- * @tot_nodes: Total number of nodes in the SCIF network
  * @node_id: Unique id of the node
  * @h2c_scif_db - Host to card SCIF doorbell set by card
  * @c2h_scif_db - Card to host SCIF doorbell set by host
@@ -89,12 +84,7 @@ struct mic_device_ctrl {
  */
 struct mic_bootparam {
 	__le32 magic;
-	__s8 c2h_shutdown_db;
-	__s8 h2c_shutdown_db;
 	__s8 h2c_config_db;
-	__u8 shutdown_status;
-	__u8 shutdown_card;
-	__u8 tot_nodes;
 	__u8 node_id;
 	__u8 h2c_scif_db;
 	__u8 c2h_scif_db;
@@ -219,12 +209,12 @@ static inline unsigned mic_total_desc_size(struct mic_device_desc *desc)
  * enum mic_states - MIC states.
  */
 enum mic_states {
-	MIC_OFFLINE = 0,
+	MIC_READY = 0,
+	MIC_BOOTING,
 	MIC_ONLINE,
 	MIC_SHUTTING_DOWN,
+	MIC_RESETTING,
 	MIC_RESET_FAILED,
-	MIC_SUSPENDING,
-	MIC_SUSPENDED,
 	MIC_LAST
 };
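
The renumbered enum lines up with the state strings in the updated ABI text. The sketch below assumes, without confirmation from the diff shown here, that each sysfs string corresponds to the enumerator of the same name in declaration order. The helper `mic_state_index` is hypothetical and exists only to make the new numbering explicit.

```shell
#!/bin/sh
# Sketch only: mic_state_index is hypothetical. The string-to-enumerator
# pairing is an assumption based on matching names, not on the driver code.
mic_state_index() {
    case "$1" in
        ready)         echo 0 ;;  # MIC_READY
        booting)       echo 1 ;;  # MIC_BOOTING
        online)        echo 2 ;;  # MIC_ONLINE
        shutting_down) echo 3 ;;  # MIC_SHUTTING_DOWN
        resetting)     echo 4 ;;  # MIC_RESETTING
        reset_failed)  echo 5 ;;  # MIC_RESET_FAILED
        *)             return 1 ;; # includes the removed "suspending"/"suspended"
    esac
}
```

Note that MIC_ONLINE moves from 1 to 2 under the new numbering, so any user-space code that cached numeric values rather than strings would need to be rebuilt against the new header.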
......